BOX PATENT APPLICATION



This is a request for filing a [X] Continuation [ ] Divisional Application 
under 37 CFR 1.60, of pending prior application Serial No. 08/279,058 filed 
on July 22, 1994 of Michael O'Donnell for DNA POLYMERASE III HOLOENZYME 



[X] Enclosed is a copy of U.S. Patent Application Serial No. 08/279,058, 
filed July 22, 1994. I hereby verify that the attached papers are a 
true copy of what is shown in my records to be the above identified 
prior application, including the oath or declaration originally filed 
(37 CFR 1.60).

The enclosed copy of the prior application as originally filed
includes:

105 page(s) of specification, claims and abstract
10 sheet(s) of drawings
0 pages of declaration and power of attorney 

If the copy of the declaration being filed does not show applicant's 
signature, complete the following: 

[X] In accordance with the indication required by 37 CFR
1.60(b), my records reflect that the original signed
declaration showing applicant's signature was filed on
October 15, 1994 (copy enclosed).

2. Amendments:

[X] Cancel in this application original claims 2-4 of the prior
application before calculating the filing fee. (At least one
original independent claim must be retained for filing
purposes.)

[ ] A preliminary amendment is enclosed. (Claims added by this
amendment have been properly numbered consecutively beginning
with the number next following the highest numbered original
claim in the prior application.)

[X] Amend the specification by inserting before the first line, the 
sentence: "This is a [X] continuation, [ ] division of 
application Serial number 08/279,058, filed on July 22, 1994." 
3. The Filing Fee is Calculated Below:

CLAIMS AS FILED IN THE PRIOR APPLICATION
LESS ANY CLAIMS CANCELLED BY AMENDMENT BELOW

                                                SMALL ENTITY       LARGE ENTITY
                        (Col. 1)    (Col. 2)
FOR:                    NO. FILED   NO. EXTRA   RATE     FEE   OR  RATE     FEE
BASIC FEE               xxxxxxxxx   xxxxxxxxx   XXXXXX   $385  OR  XXXXXX   $770
TOTAL CLAIMS            1 - 20 =    0           x 11 =   $     OR  x 22 =   $
INDEP CLAIMS            1 - 3 =     0           x 40 =   $     OR  x 80 =   $
[ ] MULTIPLE DEPENDENT
    CLAIM PRESENTED                             x130 =   $     OR  x260 =   $
                                                TOTAL    $385  OR  TOTAL    $

*If the Total Claims are less than 20
and Indep. Claims are less than 3,
enter "0" in Col. 2
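The fee table above reduces to a short calculation: the basic fee, plus a per-claim charge for every claim beyond the thresholds entered in Col. 2. The following Python sketch reproduces that arithmetic using only the rates printed on this form. The function and parameter names are illustrative assumptions and are not part of the form.

```python
# Rates as printed on this form, in dollars (small entity / large entity).
RATES = {
    "small": {"basic": 385, "per_claim": 11, "per_indep": 40, "multi_dep": 130},
    "large": {"basic": 770, "per_claim": 22, "per_indep": 80, "multi_dep": 260},
}


def filing_fee(total_claims, indep_claims, multiple_dependent=False, entity="small"):
    """Return the filing fee in dollars for the given claim counts."""
    rates = RATES[entity]
    extra_total = max(total_claims - 20, 0)  # Col. 2: total claims beyond 20
    extra_indep = max(indep_claims - 3, 0)   # Col. 2: independent claims beyond 3
    fee = rates["basic"]
    fee += extra_total * rates["per_claim"]
    fee += extra_indep * rates["per_indep"]
    if multiple_dependent:
        fee += rates["multi_dep"]
    return fee


# This filing: one claim, one independent claim, small entity.
print(filing_fee(1, 1, entity="small"))
```

With one claim filed and no extras, only the basic fee applies, which agrees with the check amount stated on the form.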



Small Entity Status:

[X] A verified statement that this filing is by a small entity:
    [ ] is attached
    [X] has been filed in the parent application (copy enclosed)
        and such status is still proper and desired (37 CFR
        1.28(a))

5. New [ ] formal, [ ] informal drawings are enclosed.

6. Priority 35 U.S.C. 119:

[ ] Priority of application serial no. 0 /______ filed on ______
    in ______ (country) is claimed under 35 U.S.C. 119.

    The certified copy has been filed in prior U.S. Application
    Serial No. 0 /______ on ______

[ ] The certified copy will follow.
Assignments:

[X] The prior application is assigned of record to Cornell Research
    Foundation, Inc.
[ ] An assignment of the invention to ______ is attached.

Fee Payment:

[ ] Not Enclosed

[X] A check in the amount of $385 is enclosed.

[ ] A check in the amount of $______ to cover the assignment
    recording fee is enclosed.

[X] The Commissioner is hereby authorized to charge any additional
    fees which may be required or credit any overpayment to Deposit
    Account No. 14-1138. A duplicate copy of this form is
    attached.

Power of Attorney:

[X] The power of attorney in the prior application is to: George M.
    Yahwak, Registration No. 26,824

a. [ ] The power appears in the original papers in the prior
       application.

b. [ ] Since the power does not appear in the original papers, a
       copy of the power in the prior application is enclosed.

c. [X] A Revocation and Power of Attorney has been executed and is
       attached.

d. [X] Address all future communications to (may only be
       completed by applicant, or attorney or agent of record):

       Michael L. Goldman
       Nixon, Hargrave, Devans & Doyle LLP
       Clinton Square
       P.O. Box 1051
       Rochester, New York 14603



10. Maintenance of Copendency of Prior Application 

[ ] A copy of the Request for Extension of Time filed in the 
pending prior application is attached. 



Date 



Respectfully submitted, 



Karla M. Weyand
Registration No. 40,223

NIXON, HARGRAVE, DEVANS & DOYLE LLP
Clinton Square, P.O. Box 1051
Rochester, New York 14603
Telephone: (716) 263-1508 
Telecopy: (716) 263-1600 
VERIFIED STATEMENT (DECLARATION) CLAIMING SMALL ENTITY
STATUS (37 CFR 1.9(f) and 1.27(d)) - NONPROFIT ORGANIZATION

I hereby declare that I am an official empowered to act on behalf of the
nonprofit organization identified below:

NAME OF ORGANIZATION: CORNELL RESEARCH FOUNDATION, INC.

ADDRESS OF ORGANIZATION: 20 Thornwood Drive, Suite 105

TYPE OF ORGANIZATION

[X]* UNIVERSITY OR OTHER INSTITUTION OF HIGHER EDUCATION

[ ] TAX EXEMPT UNDER INTERNAL REVENUE SERVICE CODE (26 USC 501(a)
    and 501(c)(3))

[ ] NONPROFIT SCIENTIFIC OR EDUCATIONAL UNDER STATUTE OF STATE
    OF THE UNITED STATES OF AMERICA

    (NAME OF STATE ______)

    (CITATION OF STATUTE ______)

[ ] WOULD QUALIFY AS TAX EXEMPT UNDER INTERNAL REVENUE SERVICE
    CODE (26 USC 501(a) and 501(c)(3)) IF LOCATED IN THE UNITED STATES
    OF AMERICA

[ ] WOULD QUALIFY AS NONPROFIT SCIENTIFIC OR EDUCATIONAL UNDER
    STATUTE OF STATE OF THE UNITED STATES OF AMERICA IF LOCATED IN
    THE UNITED STATES OF AMERICA

    (NAME OF STATE ______)

    (CITATION OF STATUTE ______)

I hereby declare that the nonprofit organization identified above
qualifies as a nonprofit organization as defined in 37 CFR 1.9(e) for
purposes of paying reduced fees under Section 41(a) and (b) of Title 35,
United States Code with regard to the invention entitled
described in

[X] the specification filed herewith.

[ ] application serial no. 0 /______

[ ] patent no. ______, issued ______

I hereby declare that rights under contract or law have been conveyed to
and remain with the nonprofit organization with regard to the above
identified invention.

If the rights held by the nonprofit organization are not exclusive, each
individual, concern or organization having rights to the invention is
listed below* and no rights to the invention are held by any person,
other than the inventor, who could not qualify as a small business
concern under 37 CFR 1.9(d) or by any concern which would not qualify as
a small business concern under 37 CFR 1.9(d) or a nonprofit organization
under 37 CFR 1.9(e).

United States Code, and that such willful false statements may jeopardize
the validity of the application, any patent issuing thereon, or any
patent to which this verified statement is directed.

*Cornell Research Foundation, Inc., is a corporation which is
wholly owned by Cornell University handling Patents and Licensing.
DNA POLYMERASE III HOLOENZYME

Abstract Of The Disclosure:

The present invention is directed toward the 5 previously
unknown genes, for subunits δ, δ', χ, θ, and ψ, of the DNA polymerase III
holoenzyme, and toward a unique man-made enzyme containing 5,
preferably 6, protein subunits which shows the same activity as the
naturally occurring 10 protein subunit DNA polymerase III holoenzyme.
DNA POLYMERASE III HOLOENZYME

Research support which led to the making of the present
invention was provided in part by funding from the National Institutes
of Health under Grant No. GM-38839. Accordingly, the federal
government has certain statutory rights to the invention described
herein under 35 U.S.C. 200 et seq.

The present application for Letters Patent is a Continuation-in-Part
of my earlier United States Patent Application 07/826,926, filed
January 24th 1992, said Continuation-in-Part having been filed as
International Patent Application PCT US93/00627 on January 22nd
1993.

In 1982, Arthur Kornberg was the first to purify DNA polymerase
III holoenzyme (holoenzyme) and determine that it was the principal
polymerase that replicates the E. coli chromosome.

In common with chromosomal replicases of phages T4 and T7,
yeast, Drosophila, mammals and their viruses, the E. coli holoenzyme
contains at least ten subunits in all (α, ε, θ, τ, γ, δ, δ', χ, ψ, β)
[see J. Biol. Chem. 257:11468 (1982)]. It has been proposed that
chromosomal replicases may contain a dimeric polymerase in order to
replicate both the leading and lagging strands concurrently. Indeed,
the 1 MDa size of the holoenzyme and the apparent equal stoichiometry
of its subunits (except β, which is twice the abundance of the others)
are evidence that the holoenzyme has the following dimeric composition:

(αεθ)2τ2(γδδ'χψ)2β4.

One of the features of the holoenzyme which distinguish it as a
chromosomal replicase is its use of ATP to form a tight, gel filterable,
"initiation complex" on primed DNA. The holoenzyme initiation complex
completely replicates a uniquely primed bacteriophage single-strand
DNA (ssDNA) genome coated with the ssDNA binding protein (SSB), at a
speed of at least 500 nucleotides per second (at 30°C) without
dissociating from an 8.6 kb circular DNA even once. This remarkable
processivity (nucleotides polymerized in one template binding event)
and catalytic speed is in keeping with the rate of replication fork
movement in E. coli (1 kb/second at 37°C). In comparison, DNA
polymerase I as well as the T4 polymerase, Taq polymerase, and T7
polymerase (Sequenase) are all very slow (10-20 nucleotides) and lack
high processivity (10-12 nucleotides per binding event). With these
distinctive features the pol III holoenzyme has commercial application.
However, there is a good reason it has not yet been applied
commercially. Namely, there are only a few (10-20) molecules of
pol III holoenzyme per cell and thus it is difficult to purify; only a few
tenths of a milligram can be obtained from 1000 liters of cells; and it
cannot be simply overproduced by genetic engineering because it is
composed of 10 different subunits.
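The speed and processivity figures quoted above can be put side by side with simple arithmetic. The following Python sketch, offered only as an illustration, computes how long the holoenzyme needs to copy the 8.6 kb circle at the stated minimum rate, and how many template-binding events a polymerase with a processivity of 10-12 nucleotides would need for the same template. The function names are assumptions introduced for this example.

```python
import math


def replication_time_s(template_nt, rate_nt_per_s):
    """Seconds needed to copy a template at a constant polymerization rate."""
    return template_nt / rate_nt_per_s


def binding_events(template_nt, processivity_nt):
    """Minimum number of template-binding events needed to copy a template."""
    return math.ceil(template_nt / processivity_nt)


TEMPLATE_NT = 8600  # the 8.6 kb circular DNA mentioned in the text

# Holoenzyme: at least 500 nt/s, so this is an upper bound on the time.
holo_time = replication_time_s(TEMPLATE_NT, 500)

# A polymerase that releases the template every 10-12 nucleotides.
events_at_12 = binding_events(TEMPLATE_NT, 12)
events_at_10 = binding_events(TEMPLATE_NT, 10)

print(holo_time, events_at_12, events_at_10)
```

The holoenzyme completes the circle in a single binding event, whereas a low-processivity enzyme must rebind the template several hundred times.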

The subunits of DNA polymerase III holoenzyme are set forth in
the following table:

Gene    Subunit    Mass (kDa)    Functions
        α          130           DNA polymerase
        ε          27            Proofreading 3'-5' exonuclease
holE    θ          10
        τ          71            Dimerizes core, DNA-dependent ATPase
        γ          47            Binds ATP
holA    δ          35            Interacts with γ to transfer β to DNA
holB    δ'         33            DNA-dependent ATPase with γ
holC    χ          15
holD    ψ          12
        β          40            Sliding clamp on DNA, binds core
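As a consistency check, the subunit masses in the table can be combined with the dimeric composition proposed earlier in the text, read here as (αεθ)2τ2(γδδ'χψ)2β4. The Python sketch below performs that sum; the dictionary keys are spelled-out Greek letter names chosen for this example, and the copy numbers are an interpretation of the composition formula rather than separately stated values.

```python
# Subunit masses in kDa, taken from the table above.
MASS_KDA = {
    "alpha": 130, "epsilon": 27, "theta": 10, "tau": 71, "gamma": 47,
    "delta": 35, "delta_prime": 33, "chi": 15, "psi": 12, "beta": 40,
}

# Copies per holoenzyme, as read from the proposed dimeric composition:
# two cores, two tau, two gamma-complex sets, and four beta.
COPIES = {
    "alpha": 2, "epsilon": 2, "theta": 2, "tau": 2, "gamma": 2,
    "delta": 2, "delta_prime": 2, "chi": 2, "psi": 2, "beta": 4,
}


def holoenzyme_mass_kda(masses, copies):
    """Sum of subunit mass times copy number over all subunits."""
    return sum(masses[name] * copies[name] for name in copies)


total_kda = holoenzyme_mass_kda(MASS_KDA, COPIES)
print(total_kda)
```

The sum is of the same order as the approximately 1 MDa size cited for the holoenzyme.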



As discovered in making the present invention, the δ' is a mixture
of two proteins, both encoded by the same holB gene, and therefore it
may be regarded as two subunits of the holoenzyme, thus bringing the
total number of subunits in the holoenzyme to eleven.

The genes for 5 of the holoenzyme's subunits have been identified
[see Nucleic Acids Research 14(20): 8091 (1986); Gene 28:159 (1984);
PNAS (USA) 80:7137 (1982); J. of Bacteriology 169(12): 5735 (1987);
and J. of Bacteriology 158:455 (1984)]. These 5 genes have been cloned
and overproducing expression plasmids for these 5 subunits are
commercially available. However, prior to the present invention, the
identity of the genes for the remaining 5 subunits which make up the
holoenzyme was not known.

The present invention describes, for the first time, the genetic
and peptide sequences for the remaining five subunits of the
polymerase III holoenzyme. In addition to sequencing these genes, very
efficient overproducing plasmids for each of them have been
constructed, and purification protocols for each have been devised.
Whereas the low amount of holoenzyme in cells has allowed the
subassemblies to be available only in microgram quantities prior to the
present invention (milligrams of pure α, ε, τ, γ and β subunits are
available using molecular cloned genes in overproducing expression
plasmids), utilizing techniques according to the present invention it has
been possible to obtain approximately 100 mg of homogeneous subunit
from 4 L of cells.

Prior to the identification of the remaining 5 genes of the
holoenzyme, a few micrograms of each subunit were resolved from the
holoenzyme. The sequence analysis of these resolved subunits
eventually led to the identification of their genetic sequences, and
then to the genes per se. In addition, reconstitution studies were
carried out to determine which subunits were essential to the speed
and processivity of the holoenzyme. In addition, overproducing
expression plasmids for these 5 subunits were produced.

Following these studies, it has now been determined, according
to the present invention, that at least 5 subunits are required for the
action of this enzyme (α, ε, β, δ, and γ), and preferably 6 subunits are
essential for the speed and processivity of the holoenzyme. These
subunits, the combination of which is essential for the unique
synthetic capabilities of the holoenzyme, according to the present
invention, are: α, ε, β, δ, δ', and γ.

The 5 subunits according to the present invention which have
been identified, sequenced, cloned, provided in overproducing
expression plasmids, expressed, and purified for the first time are
subunits δ, δ', χ, θ, and ψ.
The following figures, detailed description and examples are
provided in order to allow the reader to obtain a fuller and more
complete understanding of the present invention. With regard to the
figures,

Fig. 1 depicts the pET-δ expression vector according to the
present invention;

Fig. 2 depicts the pET-δ' expression vector according to the
present invention;

Fig. 3A, B, and C depict the replication activity of δ, δ' and δδ'
with γ and τ according to the present invention;

Fig. 4A and B depict the effect of δ' and δ on the ATPase activity
of γ and τ according to the present invention;

Fig. 5 depicts the pET-θ expression plasmid according to the
present invention;

Fig. 6A and B depict the reconstitution assay according to the
present invention indicating that θ does not stimulate DNA synthesis;

Fig. 7 depicts that θ, according to the present invention,
stimulates ε in excision of an incorrect 3' TG base pair;

Fig. 8A and B depict the native molecular weight of αε and pol III
core according to the present invention;

Fig. 9 depicts the construction of the pET-ψ overproducing
plasmid according to the present invention;

Fig. 10A and B depict the stimulation of the DNA dependent
ATPase of γ and τ by ψ and χ, according to the present invention;

Fig. 11 depicts the construction of the pET-χ expression plasmid
according to the present invention; and

Fig. 12A and B depict the native molecular mass of χ, ψ and the χψ
complex, according to the present invention.

More specifically with regard to figure 1, there is shown the
expression vector for δ as prepared and described in the following
examples. The holA gene excised from M13-δ-NdeI using NdeI is shown
above the pET3c vector. The open reading frame encoding δ is inserted
into the NdeI site of pET3c such that the initiating ATG is positioned
downstream of the Shine-Dalgarno sequence and a T7 promoter.
Downstream of the holA insert are 492 nucleotides of E. coli DNA and
591 nucleotides of M13mp18 DNA. The T7 RNA polymerase termination
sequence is downstream of the holA insert.

More specifically with regard to figure 2, the holB fragment
excised from M13-δ'-NdeI using NdeI is shown above the expression
vector. The open reading frame encoding δ' is inserted into the NdeI site
of pET3c such that the initiating ATG is positioned downstream of the
Shine-Dalgarno sequence and a T7 promoter. The holB insert also
contains 158 nucleotides of E. coli DNA downstream of the holB stop
codon to an NdeI site. The T7 polymerase termination sequence is
downstream of the holB insert.

With regard to figure 3, replication assays were performed as
described below. Figure 3C summarizes the replication assays. Either γ
or τ was titrated into assays containing SSB "coated" primed M13mp18
ssDNA, β, αε and either 2 ng δ, 2 ng δ' or an equal mixture (1 ng each) of δ
and δ' (δδ'). The reaction mixture was preincubated for 8 minutes to
allow reconstitution of the processive polymerase prior to initiating a
20 second pulse of DNA synthesis. Figure 3A depicts the results of the
γ subunit being titrated into the replication mixture either alone (open
squares) or containing either δ' (closed circles), δ (open circles), or δδ'
(closed squares). Figure 3B depicts the results of the τ subunit being
titrated into the replication mixture either alone (open triangles), or
containing either δ' (closed circles), δ (open circles), or δδ' (closed
squares).

With regard to figure 4, ATPase assays were performed in the
presence of M13mp18 ssDNA as described in detail below. The subunits
in each assay are identified below the plot in the figure. Figure 4A
refers to the effect of δ, δ' and β on the γ ATPase; figure 4B refers to
the effect of δ, δ' and β on the τ ATPase.

With regard to figure 5, the shaded NdeI-BamHI segment includes
the holE gene (arrow). Transcription of holE is driven by a T7
promoter. The T7 RNA polymerase termination sequence is downstream
from the E. coli DNA insert. Translation of θ is aided by an upstream
Shine-Dalgarno sequence.
With regard to figure 6, the replication reactions were performed
as described below. Figure 6A outlines the protocol summarizing the
assay. Either the αε complex or reconstituted pol III core (αεθ) was
titrated into the assay, which contains β, γ complex and primed phage
φX174 ssDNA "coated" with SSB. Proteins and DNA were preincubated for
6 minutes to allow time for assembly of the processive polymerase. A
15 second round of synthesis was initiated upon addition of remaining
deoxynucleoside triphosphates. Circles: titration of αε complex;
triangles: titration of αεθ. The α subunit was limiting in these assays
and therefore the amount of αε and αεθ added to the assay is taken as
the amount of α added.

With regard to figure 7, there is depicted the results of a
titration of θ into the assay containing ε and a mismatched 3'
32P-end-labelled T residue on a synthetic "hooked" primer template.

With regard to figure 8, there is shown a comparison of the
migration of αε and pol III core relative to protein standards on gel
filtration and in glycerol gradients. The position of pol III core
reconstituted using either excess or substoichiometric θ was the same
in both types of analysis. Figure 8A depicts gel filtration analysis on
Superose 12. The Stokes radius of protein standards was calculated
from their known diffusion coefficients. Figure 8B depicts glycerol
gradient sedimentation analysis. The protein standards are: Amy,
sweet potato β-amylase (152 kDa, 8.9S); Apf, horse apoferritin
(467 kDa, 59.5Å); IgG, bovine immunoglobulin G (158 kDa, 52.3Å, 7.4S);
BSA, bovine serum albumin (67 kDa, 34.9Å, 4.41S); Ova, chicken
ovalbumin (43.5 kDa, 27.5Å, 3.6S); Myo, horse myoglobin (17.5 kDa,
19.0Å, 2.0S). The positions of αε and pol III core relative to the
protein standards are indicated in the plots. The Stokes radii and S
values of αε and pol III were measured by comparison to the standards.
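The text does not state how a Stokes radius and an S value are combined into a native molecular mass. One widely used approach, shown here only as an assumed illustration and not as the method of the present invention, is the Siegel-Monty relation M = 6πηN·a·s / (1 - v̄ρ). The Python sketch below applies it to two of the protein standards listed above. The solvent viscosity and density (water at 20°C) and the partial specific volume (0.725 ml/g, a typical value for proteins) are assumptions not taken from the text.

```python
import math

AVOGADRO = 6.02214076e23   # molecules per mole
ETA_WATER_20C = 0.01002    # assumed solvent viscosity, g/(cm*s)
RHO_WATER_20C = 0.9982     # assumed solvent density, g/ml
V_BAR_PROTEIN = 0.725      # assumed partial specific volume, ml/g


def native_mass_da(stokes_radius_angstrom, s_value_svedberg,
                   v_bar=V_BAR_PROTEIN, eta=ETA_WATER_20C, rho=RHO_WATER_20C):
    """Native mass in daltons from a Stokes radius and an S value."""
    radius_cm = stokes_radius_angstrom * 1e-8  # 1 Angstrom = 1e-8 cm
    s_seconds = s_value_svedberg * 1e-13       # 1 Svedberg = 1e-13 s
    numerator = 6 * math.pi * eta * AVOGADRO * radius_cm * s_seconds
    return numerator / (1 - v_bar * rho)


# Standards from the text: BSA (34.9 A, 4.41S) and ovalbumin (27.5 A, 3.6S).
bsa_mass = native_mass_da(34.9, 4.41)
ova_mass = native_mass_da(27.5, 3.6)
print(round(bsa_mass / 1000, 1), round(ova_mass / 1000, 1))
```

Under these assumptions the computed masses fall somewhat below the 67 kDa and 43.5 kDa listed for the two standards, which is the level of agreement this kind of estimate usually gives.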

With regard to figure 9, the holD gene was amplified from
genomic DNA using primers which form an NdeI site at the initiating
ATG and a downstream BamHI site. Due to an internal NdeI site within
holD, insertion of the complete holD gene into the pET3c expression
plasmid required the two steps shown below. Additional details appear
in the following description.

With regard to figure 10, ATPase assays were performed using a
two-fold molar excess of χ and ψ (as monomers) over γ and τ (as dimers)
and using M13mp18 ssDNA as an effector. Figure 10A depicts ATPase
assays of ψ, χ, γ and combinations of these proteins; figure 10B depicts
the effect of ψ and χ subunits on the ATPase of τ. Subunits in the
assays are indicated below the plots, and assays performed in the
presence of SSB are indicated.

With regard to figure 11, the holC gene was amplified from
genomic DNA using primers which generate an NdeI site at the start
codon of holC and a BamHI site 152 nucleotides downstream of holC as
described below. The 604 bp amplified product was purified, digested
with NdeI and BamHI, and ligated into the NdeI and BamHI sites of pET3c
to yield pET-χ. The open reading frame encoding χ was inserted into the
NdeI-BamHI sites of pET3c such that the initiating ATG is positioned
downstream of the Shine-Dalgarno sequence and a T7 promoter. The T7
RNA polymerase termination sequence is downstream of the holC insert.
The Amp r indicates the ampicillin resistance gene; the ori indicates the
pBR322 origin of replication.

With regard to figure 12A, the Stokes radius of χ, ψ and the χψ
complex was determined by comparison with protein standards in gel
filtration on Superdex 75. With regard to figure 12B, the S values of χ, ψ
and the χψ complex determined by comparison to protein standards in
glycerol gradient analysis are given. The protein standards were:
bovine serum albumin (BSA), 34.9Å, 4.41S; chicken ovalbumin (Ova),
27.5Å, 3.6S; soybean trypsin inhibitor (Ti), 23.8Å; bovine carbonic
anhydrase II (Carb), 3.06S; horse myoglobin (Myo), 19.0Å, 2.0S; and
horse kidney metallothionein (Met), 1.75S.
In general, determining the sequence of each of the genes for the five
subunit peptides, according to the present invention, began with
isolating, purifying and sequencing the individual peptides.

The δ, δ', χ, ψ subunits were purified by a combination of two
published procedures. First the γ complex (γ, δ, δ', χ, ψ) was purified from
1.5 kg E. coli HB101 (pNT203-pSK100) as described by Maki [see J. Biol.
Chem. 263:6555 (1988)]. Second, the complex was split into two
fractions - a "γχψ" complex and a "δδ'" complex - as described by
O'Donnell [see J. Biol. Chem. 265:1179 (1990)]. The peptide sequences for
δ and δ' were obtained from the δδ' fraction, whereas the peptide
sequences of χ and ψ were obtained from the γχψ fraction. The θ subunit
sequence was obtained from a side fraction of this procedure which
contained nearly pure polymerase III (α, ε, θ) subunits.

For all 5 proteins, the amino acid sequences were obtained in the
same manner, by the use of N-terminal analysis and tryptic analysis.
N-terminal analysis was conducted using known techniques of SDS-PAG
electrophoresis [see Nature 227:680 (1970)] in a 15% gel, and
subsequent electroelution onto PVDF membrane. The resolved peptides
were removed from the membrane and sequenced. For tryptic analysis,
either δδ' or γχψ complex was chromatographed in a 15% SDS-PAG gel to
separate the individual subunits. However, for this procedure, the 100
pmol was electroblotted onto nitrocellulose. The nitrocellulose
membrane was then digested with trypsin, and the peptides resolved by
microbore HPLC. The resolved peptides were then sequenced.

The electroblotting procedure used in the tryptic analysis
protocol is more fully described in the following general example:

EXAMPLE I
(electroblotting)

SDS (Bio-Rad) was purified by crystallization from ethanol-water.
SDS (100 g) was added to ethanol (450 g), stirred, and heated to
55°C. Additional hot water was added (50-75 ml) until all of the SDS
dissolved. Activated charcoal (10 g) was added to the solution, and
after 10 minutes the slurry was filtered through a Buchner funnel
(Whatman No. 5 paper) to remove all the charcoal. The filtered solution
was chilled, first at 4°C for 24 hrs and then at -20°C for an additional
24 hrs. Crystalline SDS was collected on a coarse-frit sintered-glass
funnel and washed with 800 ml of ethanol chilled to -20°C. The partially
purified SDS was then recrystallized using the above procedure but
without the charcoal. A 0.75 mm SDS-Laemmli gel was made using
ultra-pure reagents. Prior to electrophoresis 10 mM glutathione (to a
final concentration of 0.05 mM) was added to the upper chamber buffer,
and the system preelectrophoresed 2 hr at 3-5 mA (3 mA for mini-gel,
5 mA for normal). After 2 hrs, the upper chamber was emptied and
standard tris-glycine buffer was added. The samples to be run were
denatured using Laemmli denaturation solution made from the ultra-pure
reagents (in the presence of 5 mM DTT). The gel was run under
conditions such that separation was achieved in less than 2 hrs. After
the gel run, the gel was soaked for 30 min in 10 mM CAPS pH 11, 5%
methanol (% of methanol will vary depending on the size of the protein:
in general, high molecular weight proteins transfer more efficiently in
absence of methanol while low molecular weight proteins require
methanol in the buffer). CAPS buffer was made by titrating a 10 mM
solution with 10 N NaOH. For gel transfer, slices of Immobilon were wet
in 100% methanol and equilibrated 10 min in the CAPS transfer buffer,
and the protein transferred using a Bio-Rad mini blotter (transfer time
will vary depending on protein size, methanol, etc.; a ~70 kDa
polypeptide will transfer in 90 min in the presence of 5% methanol at
15V). After transfer, Immobilon was soaked in distilled water for 5 min,
and the membrane was stained with 0.1% Coomassie Blue R250 in 50%
methanol for 1 min, and destained in 50% methanol and 10%
aldehyde-free acetic acid. The membranes were soaked in distilled water
for 10 min, and allowed to air dry. Protein bands of interest were cut
from the membrane and stored in eppendorf tubes at -20°C until
sequenced.
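The glutathione addition in Example I is a simple dilution from a 10 mM stock to a final concentration of 0.05 mM. The Python sketch below shows the C1·V1 = C2·V2 arithmetic. The upper chamber buffer volume is not given in the text, so the 200 ml used here is a purely hypothetical value, as are the function names.

```python
def dilution_factor(stock_mM, final_mM):
    """Fold dilution needed to go from the stock to the final concentration."""
    return stock_mM / final_mM


def stock_volume_ml(stock_mM, final_mM, final_volume_ml):
    """Volume of stock so that C_stock * V_stock = C_final * V_final."""
    if final_mM > stock_mM:
        raise ValueError("final concentration cannot exceed the stock")
    return final_mM * final_volume_ml / stock_mM


# Concentrations from the text; the 200 ml chamber volume is hypothetical.
fold = dilution_factor(10, 0.05)
volume = stock_volume_ml(10, 0.05, 200)
print(fold, volume)
```

For any other chamber volume the stock volume scales in proportion, since the dilution factor is fixed by the two concentrations.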

The identification of the gene for the δ subunit of DNA polymerase III
was accomplished by purifying the δδ' proteins to apparent homogeneity
through an ATP-agarose column from 1.3 kg of the γ/τ overproducing
strain of E. coli [HB101 (pNT203, pSK100)].

The δ and δ' subunits were separated by electrophoresis in a 15%
SDS-PAG (polyacrylamide gel), then electroblotted onto PVDF membrane
(Whatman) for N-terminal sequencing (50 pmol each) [see J. Biol. Chem.
262:10035 (1987)], and onto nitrocellulose membrane (Schleicher and
Schuell) for tryptic analysis (140 pmol each) [see PNAS USA 84:6970
(1987)]. Proteins were visualized by Ponceau S (Sigma). Protein
sequences were determined using an Applied Biosystems 470A gas-phase
microsequenator. Sequence results were as follows:

N-terminal sequence:
NH2-Met Leu Arg Leu Tyr Pro Glu Gln Leu Arg Ala Gln Leu Asn
                    5                   10
    Glu Gly Leu Arg Ala Ala Tyr Leu Leu Leu Gly Asn Asp Pro;
    15                  20                  25

tryptic peptide δ-1:
NH2-Ala Ala Tyr Leu Leu Leu Gly Asn Asp Pro Leu Leu Leu Gln
                    5                   10
    Glu Ser Gln Asp Ala Val Arg;
    15                  20

tryptic peptide δ-2:
NH2-Ala Gln Glu Asn Ala Ala Trp Phe Thr Ala Leu Ala Asn Arg;
                    5                   10

tryptic peptide δ-3:
NH2-Val Glu Gln Ala Val Asn Asp Ala Ala His Phe Thr Pro Phe
                    5                   10
    His Trp Val Asp Ala Leu Leu Met (Gly) (Lys).
    15                  20

Parentheses in the above sequence indicate uncertainty in
the amino acid assignment.
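The N-terminal sequence and the first tryptic peptide listed above share residues, which can be checked mechanically. The following Python sketch converts the three-letter sequences to one-letter code and finds the longest stretch where the end of the N-terminal sequence matches the start of the tryptic peptide. It is an illustration only; the helper names are assumptions, and the three-letter residues are as read from the listing (with "Gln" for glutamine).

```python
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter):
    """Convert a space-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in three_letter.split())


def overlap_length(left, right):
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    for n in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:n]):
            return n
    return 0


N_TERMINAL = to_one_letter(
    "Met Leu Arg Leu Tyr Pro Glu Gln Leu Arg Ala Gln Leu Asn "
    "Glu Gly Leu Arg Ala Ala Tyr Leu Leu Leu Gly Asn Asp Pro"
)
PEPTIDE_1 = to_one_letter(
    "Ala Ala Tyr Leu Leu Leu Gly Asn Asp Pro Leu Leu Leu Gln "
    "Glu Ser Gln Asp Ala Val Arg"
)

shared = overlap_length(N_TERMINAL, PEPTIDE_1)
merged = N_TERMINAL + PEPTIDE_1[shared:]
print(shared, merged)
```

The tryptic peptide begins immediately after an Arg residue of the N-terminal sequence, as expected for a trypsin cleavage product, so the two reads can be joined into a single longer stretch.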

The DNA sequencing, construction of the overproducing vector,
and DNA replication assays for this subunit were conducted according
to the following example:

EXAMPLE II

DNA sequencing:

The 3.2 kb KpnI/BglII (restriction enzymes, New England Biolabs)
fragment containing δ was excised from M69 (Kohara) and directionally
ligated into pUC18 to yield pUCdelta. Both strands of DNA were
sequenced by the chain termination method of Sanger using the United
States Biochemicals Sequenase kit, [α-35S]dCTP (New England Nuclear),
and synthetic DNA 17-mers (Oligos etc. Inc.). All sequence information
presented here was determined on both strands using both dGTP and
dITP in sequencing reactions.

Construction of the overproducing vector: 
Approximately 1.7 kb of DNA upstream of δ was excised from
pUCdelta using KpnI (polylinker site) and BstXI (the BstXI site is 13
base pairs upstream of the start codon of holA) followed by
self-ligation of the plasmid. A 1.5 kb fragment containing the holA gene
was then excised using EcoRI and XbaI (these sites are in the pUC
polylinker on either side of the δ insert) followed by directional
ligation into M13mp18 to yield M13delta. An NdeI site was generated at
the start codon of holA by primer directed mutagenesis [see Methods
Enzymol. 154:367 (1987)] using a DNA 33-mer (5'->3'):

GTACAACCGA ATTATATGTT ACCCAGCGAG CTC 33

containing the NdeI site (underlined) at the start codon of holA to
prime replication of M13delta viral ssDNA, and using DNA polymerase
and SSB in place of Klenow polymerase to completely replicate the
circle without strand displacement [see J. Biol. Chem. 260:12884
(1985)]. The NdeI site was verified by DNA sequencing. An NdeI
fragment (2.1 kb) containing the δ gene was excised from the NdeI
mutated M13delta and ligated into pET-3c linearized using NdeI to
yield pETdelta. The orientation of the holA gene in pETdelta was
determined by sequencing.
DNA replication assays:

The replication assay contained 72 ng M13mp18 ssDNA (0.03 pmol
as circles) uniquely primed with a DNA 30-mer [see J. Biol. Chem.
266:11328 (1991)], 980 ng SSB (13.6 pmol as tetramer), 22 ng β (0.29
pmol as dimer), 200 ng γ (2.1 pmol as tetramer), 55 ng αε complex in
a final volume (after addition of proteins) of 25 µl 20 mM Tris-HCl
(pH 7.5), 8 mM MgCl2, 5 mM DTT, 4% glycerol, 40 µg/ml BSA, 0.5 mM
ATP, 60 nM each dCTP, dGTP, dATP and 20 µM [α-32P]dTTP (New
England Nuclear). Proteins used in the reconstitution assay were
purified to apparent homogeneity and their concentration determined.
Delta protein, or column fraction containing δ, was diluted in buffer (20
mM Tris-HCl (pH 7.5), 2 mM DTT, 0.5 mM EDTA, 20% glycerol, 60 mM NaCl
and 50 µg/ml BSA) such that 1-10 ng of protein was added to the assay
on ice, shifted to 37°C for 5 minutes, then quenched upon spotting
directly onto DE81 filter paper. DNA synthesis was quantitated as
described.
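The assay components above are given both as a mass and as a molar amount, so each pair implies a molecular mass for the species being counted. The Python sketch below makes that conversion explicit; it relies only on the fact that 1 ng per pmol equals 1 kDa. The function names are assumptions, and the subunit symbols for the 22 ng and 200 ng components are read here as β and γ.

```python
def implied_mass_kda(mass_ng, amount_pmol):
    """Molecular mass implied by a mass and a molar amount.

    1 ng / 1 pmol = 1e-9 g / 1e-12 mol = 1000 g/mol = 1 kDa.
    """
    return mass_ng / amount_pmol


def amount_pmol(mass_ng, mass_kda):
    """Molar amount in pmol for a given mass (ng) and molecular mass (kDa)."""
    return mass_ng / mass_kda


# (mass in ng, amount in pmol) pairs as stated for the replication assay.
COMPONENTS = {
    "M13mp18 ssDNA circle": (72, 0.03),
    "SSB tetramer": (980, 13.6),
    "beta dimer": (22, 0.29),
    "gamma (as counted in the text)": (200, 2.1),
}

for name, (ng, pmol) in COMPONENTS.items():
    print(name, round(implied_mass_kda(ng, pmol), 1), "kDa")
```

The same helper can be run in the other direction, for example to find how many picomoles a given number of nanograms of a subunit represents once its mass is known.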

Gel filtration:

Gel filtration of δ, δ' and the δδ' complex was performed using an
HR 10/30 Superdex 75 column equilibrated in 20 mM Tris-HCl (pH 7.5),
10% glycerol, 2 mM DTT, 0.5 mM EDTA and 100 mM (buffer B). Either δ
(30 µg, 0.78 nmol as monomer), δ' (30 µg, 0.81 nmol as monomer) or a
mixture of δ and δ' (30 µg of each) were incubated for 30 minutes at
15°C in 100 µl of buffer B then the entire 100 µl sample was injected
onto the column. The column was developed with buffer B at a flow
rate of 0.3 ml/minute and after the first 6 ml, fractions of 170 µl were
collected. Fractions were analyzed by 13% SDS polyacrylamide gels
(100 µl per lane) stained with Coomassie Brilliant Blue. Densitometry
of stained gels was performed using a Pharmacia-LKB Ultrascan XL
laser densitometer.

Gel filtration of γ, or of γ mixed with either δ or δ' or both δ and
δ', was performed using an HR 10/30 Superose 12 column equilibrated in
buffer B. Protein mixtures were preincubated 30 minutes at 15°C in
100 µl buffer B, then injected onto the column, and the column was
developed and analyzed as described above. Replication activity assays
of these column fractions were performed as described above with the
following modifications. The γ subunit was omitted from the assay and
each fraction was diluted 50-fold with 20 mM Tris-HCl (pH 7.5), 10%
glycerol, 2 mM DTT, 0.5 mM EDTA and 50 µg/ml BSA. Then 2 µl of
diluted fraction was withdrawn and added to the assay.

Protein standards were a mixture of proteins obtained from
Bio-Rad and from Sigma Chemical Co. and were brought to a
concentration of approximately 50 µg each in 100 µl buffer B before
analysis on either Superdex 75 or Superose 12 columns.
Glycerol gradient sedimentation:

Sedimentation analysis of δ, δ' and a mixture of δ and δ' was
performed using 11.6 ml 10%-30% glycerol gradients in buffer B. Either
δ (57 µg, 1.5 nmol as monomer), δ' (56 µg, 1.5 nmol as monomer) or a
mixture of δ and δ' (57 µg and 56 µg, respectively) was incubated at
15°C for 30 minutes in a final volume of 100 µl buffer B, then each
sample was layered onto a separate gradient. Protein standards (50 µg
each in 100 µl buffer B) were layered onto another gradient and the
gradients were centrifuged at 270,000 x g for 60 hours at 4°C.
Fractions of 170 µl were collected from the bottom of the tube and
analyzed (100 µl/lane) in a 13% SDS polyacrylamide gel stained with
Coomassie Blue.
Light scattering: 

The diffusion coefficient of δ, δ' and the δδ' complex was
determined by dynamic light scattering at 780 nm in a fixed angle (90°)
Biotage model dp-801 light scattering instrument (Oros Instruments).
Samples of δ (200 µg, 5.2 nmol as monomer) and δ' (200 µg, 5.4 nmol as
monomer) were in 400 µl of 20 mM Tris-HCl (pH 7.5), 100 mM NaCl and
1.2% glycerol. The mixture of δ and δ' (100 µg of each) was in 400 µl
of 20 mM Tris-HCl (pH 7.5) and 100 mM NaCl. The observed diffusion
coefficient of δ' in the presence of 1.2% glycerol was 0.6% higher than
in the absence of glycerol. Hence, the 1.2% glycerol in the δ and δ'
samples had little effect on the observed diffusion coefficient.

The purification of δ was performed according to the following
example:

EXAMPLE III
(purification of δ)

BL21(DE3) cells harboring pETdelta were grown at 37°C in 12
liters of LB media containing 100 µg/ml of ampicillin. Upon growth to
OD 1.5, the temperature was lowered to 25°C, and IPTG was added to 0.4
mM. After a further 3 hrs of growth, the cells (50 g) were collected by
centrifugation. Cells were lysed using lysozyme as described in prior
publications, and the debris removed by centrifugation. The following
purification steps were performed at 4°C. The assay for δ is as
described above.

The clarified cell lysate (300 ml) was diluted 2-fold with 20 mM
Tris-HCl (pH 7.5), 20% glycerol, 0.5 mM EDTA, 2 mM DTT (buffer A) to a
conductivity equal to 112 mM NaCl, and then loaded (over 3 hrs) onto a
60 ml Hexylamine Sepharose column equilibrated with buffer A plus 0.1
M NaCl. The Hexylamine column was washed with 60 ml buffer A plus
0.1 M NaCl, and then eluted (over 14 hrs) using a 600 ml linear gradient
of 0.1 M NaCl to 0.5 M NaCl in buffer A. Eighty fractions were collected.
Fractions 16-34 (125 ml) were dialyzed against 2 liters of buffer A
plus 90 mM NaCl overnight, and then diluted 2-fold with buffer A to
yield a conductivity equal to 65 mM NaCl just prior to loading (over 45
min) onto a 60 ml column of Heparin Sepharose equilibrated in buffer A
plus 50 mM NaCl. The heparin column was washed with 120 ml buffer A
plus 50 mM NaCl, and then eluted (over 14 hrs) using a 600 ml linear
gradient of 0.05 M NaCl to 0.5 M NaCl in buffer A. Eighty fractions were
collected. Fractions 24-34 were pooled and diluted 3-fold (final
volume of 250 ml) with buffer A to a conductivity equal to 85 mM NaCl
just prior to loading (over 50 min) onto a 50 ml Hi-Load 26/10 Q
Sepharose fast flow FPLC column. The column was washed with 150 ml
buffer A plus 50 mM NaCl, and then eluted using a 600 ml linear
gradient of 0.05 M NaCl to 0.5 M NaCl in buffer A. Eighty fractions
were collected. Fractions 28-36, which contained pure δ, were pooled
(74 ml, 1.9 mg/ml), passed over a 1 ml ATP-agarose column (N-6
linked) to remove any possible γ complex contaminant, and then dialyzed
versus two changes of 2 liters each of buffer A containing 0.1 M NaCl
(the DTT was omitted for the purpose of determining protein
concentration spectrophotometrically) before storing at -70°C.

The following table gives the results obtained from measuring 
the protein levels obtained from the fractions taken in Example III. 



Fraction          total     total      specific     fold      %
                  protein   units¹     activity     purifi-   yield
                  (mg)                 (units/mg)   cation

I   Lysate²       2070      5.4x10^9   2.6x10^6     1.0       100
II  Hexylamine     446      2.5x10^9   5.6x10^6     2.2        46
III Heparin        197      2.0x10^9   10.2x10^6    3.9        37
IV  Q Sepharose    141      1.5x10^9   10.6x10^6    4.1        28

¹ One unit is defined as pmol nucleotide incorporated per minute.
² Omission of gamma from the assay of the lysate resulted in a
  200-fold reduction of specific activity (units/mg).
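
The derived columns of the table (specific activity, fold
purification and yield) can be recomputed from the total protein and
total units of each step. The following Python sketch illustrates
that arithmetic; the function and variable names are hypothetical,
and only the protein and unit totals are taken from the table.

```python
# Recompute the derived columns of a purification table.
# Input rows: (step name, total protein in mg, total units).
STEPS = [
    ("I Lysate", 2070, 5.4e9),
    ("II Hexylamine", 446, 2.5e9),
    ("III Heparin", 197, 2.0e9),
    ("IV Q Sepharose", 141, 1.5e9),
]


def purification_table(steps):
    """Return (name, specific activity, fold purification, % yield) per step."""
    _, first_mg, first_units = steps[0]
    first_specific = first_units / first_mg
    rows = []
    for name, protein_mg, units in steps:
        specific = units / protein_mg              # units per mg
        fold = specific / first_specific           # relative to the lysate
        yield_pct = 100.0 * units / first_units    # activity recovered
        rows.append((name, specific, fold, yield_pct))
    return rows


if __name__ == "__main__":
    for name, specific, fold, yield_pct in purification_table(STEPS):
        print(f"{name:15s} {specific:10.3g} {fold:5.1f} {yield_pct:5.0f}")
```

The recomputed values agree with the table to within rounding; small
differences in the last digit can arise because the table was
evidently calculated from already rounded specific activities.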



The δ gene was identified using amino acid sequence information
from δ. The sequence of the N-terminal 28 amino acids of δ and the
sequence of three internal tryptic peptides were determined. One of the
tryptic peptides (tryptic peptide δ-1) overlapped 10 amino acids of the
N-terminal sequence. A search of GenBank revealed a sequence
which predicted the exact amino acid sequence of the 21 amino acid
tryptic peptide δ-1 which overlapped the N-terminal sequence. The
matching sequence occurred just downstream of the rlpB gene at 15.2
minutes of the E. coli chromosome. The match of the published DNA
sequence to the N-terminal sequence of δ was imperfect due to a few
errors in the published sequence of this region. The published sequence
of rlpB accounted for approximately 22% of the δ gene and did not
encode either of the other two tryptic fragments. The Kohara lambda
phage 169 contains 19 kb of DNA surrounding the δ gene. The 3.2 kb
KpnI/BglII fragment containing δ was excised from λ169, cloned into
pUC18 and the δ gene was sequenced. The DNA sequence predicts the
correct N-terminal sequence of δ (except for Ile instead of Leu at
position 2) and encodes the other two internal tryptic peptides of δ in
the same reading frame, and predicts a 343 amino acid protein of 38.7
kDa, consistent with the mobility of δ in SDS-PAGE (35 kDa).

The full nucleic acid sequence for the δ gene according to the
present invention was determined to be:
ATG ATT CGG TTG TAC CCG GAA CAA CTC CGC GCG CAG CTC 39
AAT GAA GGG CTG CGC GCG GCG TAT CTT TTA CTT GGT AAC 78
GAT CCT CTG TTA TTG CAG GAA AGC CAG GAC GCT GTT CGT 117
CAG GTA GCT GCG GCA CAA GGA TTC GAA GAA CAC CAC ACT 156
TTT TCC ATT GAT CCC AAC ACT GAC TGG AAT GCG ATC TTT 195
TCG TTA TGC CAG GCT ATG ACT CTG TTT GCC AGT CGA CAA 234
ACG CTA TTG CTG TTG TTA CCA GAA AAC GGA CCG AAT GCG 273
GCG ATC AAT GAG CAA CTT CTC ACA CTC ACC GGA CTT CTG 312
CAT GAC GAC CTG CTG TTG ATC GTC CGC GGT AAT AAA TTA 351
AGC AAA GCG CAA GAA AAT GCC GCC TGG TTT ACT GCG CTT 390
GCG AAT CGC AGC GTG CAG GTG ACC TGT CAG ACA CCG GAG 429
CAG GCT CAG CTT CCC CGC TGG GTT GCT GCG CGC GCA AAA 468
CAG CTC AAC TTA GAA CTG GAT GAC GCG GCA AAT CAG GTG 507
CTC TGC TAC TGT TAT GAA GGT AAC CTG CTG GGG CTG GCT 546 
CAG GCA CTG GAG CGT TTA TCG CTG CTC TGG CCA GAC GGC 585 
TTC ACA T™ rrr. rnr ^ AftA CAG GCG GTG AAT GAT 624 
5 arc am cat t t t arc fYrrrrr CAT TGG GTT GAT GCT TTG 663 
TO ATG GGA AAA AGT.AAG CGC GCA TTG CAT ATT CTT CAG 702 
CAA CTG CGT CTG GAA GGC AGC GAA CCG GTT ATT TTG TTG 741
CGC ACA TTA CAA CGT GAA CTG TTG TTA CTG GTT AAC CTG 780
AAA CGC CAG TCT GCC CAT ACG CCA CTG CGT GCG TTG TTT 819
GAT AAG CAT CGG GTA TGG CAG AAC CGC CGG GGC ATG ATG 858
GGC GAG GCG TTA AAT CGC TTA AGT CAG ACG CAG TTA CGT 897
CAG GCC GTG CAA CTC CTG ACA CGA ACG GAA CTC ACC CTC 936
AAA CAA GAT TAC GGT CAG TCA GTG TGG GCA GAG CTG GAA 975
GGG TTA TCT CTT CTG TTG TGC CAT AAA CCC CTG GCG GAC 1014
GTA TTT ATC GAC GGT TGA 1032

The underlined portions of this sequence correspond to the tryptic
peptides δ-1 (55-117), δ-2 (358-399), and δ-3 (604-672). In addition,
the upstream sequence:

COGAACAGCT GATTCGTAAG CTGCCAAGCA TCCGTGCTGC CGATATTCGT 50
TCCGACGAAG AACAGACGTC GACCACAACG GATACTCCGG CAACGCCTGC 100
ACGCGTCTCC ACCACGCTGG GTAACTG 127

wherein the last underlined TG denotes two-thirds of the rlpB stop
codon; in addition, the putative RNA polymerase promoter signals
(TCGCCA and GATATT) and the Shine-Dalgarno sequence (ACGCT) are
underlined.
In addition, the downstream nucleic acid sequence for holA
begins with a stop codon:

TGA ATGAAATCT TTACAGGCTC TGTTTGGCGG CACCTTTGAT CCGGTGCACT 53
ATGGTCATCT AAAACCCGTT GGAAGCGTGG CCGAAGTTTT GATTGGTCTG AC 105
The holA gene translates into the amino acid sequence:
Met Ile Arg Leu Tyr Pro Glu Gln Leu Arg Ala Gln Leu Asn Glu
                  5                  10                  15
Gly Leu Arg Ala Ala Tyr Leu Leu Leu Gly Asn Asp Pro Leu Leu
                 20                  25                  30
Leu Gln Glu Ser Gln Asp Ala Val Arg Gln Val Ala Ala Ala Gln
                 35                  40                  45
Gly Phe Glu Glu His His Thr Phe Ser Ile Asp Pro Asn Thr Asp
                 50                  55                  60
Trp Asn Ala Ile Phe Ser Leu Cys Gln Ala Met Ser Leu Phe Ala
                 65                  70                  75
Ser Arg Gln Thr Leu Leu Leu Leu Leu Pro Glu Asn Gly Pro Asn
                 80                  85                  90
Ala Ala Ile Asn Glu Gln Leu Leu Thr Leu Thr Gly Leu Leu His
                 95                 100                 105
Asp Asp Leu Leu Leu Ile Val Arg Gly Asn Lys Leu Ser Lys Ala
                110                 115                 120
Gln Glu Asn Ala Ala Trp Phe Thr Ala Leu Ala Asn Arg Ser Val
                125                 130                 135
Gln Val Thr Cys Gln Thr Pro Glu Gln Ala Gln Leu Pro Arg Trp
                140                 145                 150
Val Ala Ala Arg Ala Lys Gln Leu Asn Leu Glu Leu Asp Asp Ala
                155                 160                 165
Ala Asn Gln Val Leu Cys Tyr Cys Tyr Glu Gly Asn Leu Leu Asn
                170                 175                 180
Leu Ala Gln Ala Leu Glu Arg Leu Ser Leu Leu Trp Pro Asp Gly
                185                 190                 195
Lys Leu Thr Leu Pro Arg Val Glu Gln Ala Val Asn Asp Ala Ala
                200                 205                 210
His Phe Thr Pro Phe His Trp Val Asp Ala Leu Leu Met Gly Lys
                215                 220                 225
Ser Lys Arg Ala Leu His Ile Leu Gln Gln Leu Arg Leu Gly Gly
                230                 235                 240
Ser Glu Pro Val Ile Leu Leu Arg Thr Leu Gln Arg Glu Leu Leu
                245                 250                 255
Leu Leu Val Asn Leu Lys Arg Gln Ser Ala His Thr Pro Leu Arg
                260                 265                 270
Ala Leu Phe Asp Lys His Arg Val Trp Gln Asn Arg Arg Gly Met
                275                 280                 285
Met Gly Glu Ala Leu Asn Arg Leu Ser Gln Thr Gln Leu Arg Gln
                290                 295                 300
Ala Val Gln Leu Leu Thr Arg Thr Glu Leu Thr Leu Lys Gln Asp
                305                 310                 315
Tyr Gly Gln Ser Val Trp Ala Glu Leu Glu Gly Leu Ser Leu Leu
                320                 325                 330
Leu Cys His Lys Pro Leu Ala Asp Val Phe Ile Asp Gly
                335                 340         343
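
The nucleotide coordinates given for the tryptic peptides, and the
relationship between the 1032-nucleotide open reading frame and the
343-residue protein, can be checked with a short translation
routine. The Python sketch below is illustrative only; it uses the
standard genetic code, and the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA string (spaces allowed) to one-letter codes."""
    dna = dna.replace(" ", "").upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def nt_range_to_residues(first_nt, last_nt):
    """Map 1-based in-frame nucleotide coordinates to 1-based residue numbers."""
    return (first_nt - 1) // 3 + 1, last_nt // 3


if __name__ == "__main__":
    # Final codons of the holA listing, ending in the TGA stop codon.
    print(translate("GTA TTT ATC GAC GGT TGA"))
    # Peptide delta-1 is given as nucleotides 55-117.
    print(nt_range_to_residues(55, 117))
    # 343 codons plus one stop codon.
    print(343 * 3 + 3)
```

For example, nucleotides 55-117 map to residues 19-39, a span of 21
amino acids, which matches the stated length of tryptic peptide δ-1
and its 10-residue overlap with the 28-residue N-terminal sequence.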

The holA gene is located in an area of the chromosome containing
several membrane protein genes. They are all transcribed in the same
direction. The mrdA and mrdB genes encode proteins responsible for the
rod shape of E. coli and the rlpA and rlpB genes encode rare
lipoproteins which are speculated to be important to cell duplication.
The position of the δ gene within a cluster of membrane proteins may be
coincidental or may be related to the putative attachment of the
replisome to the membrane.

As noted, the termination codon of the rlpB protein overlaps one
nucleotide with the initiating ATG of holA, leaving a gap of only 2
nucleotides between these genes. holA may be in an operon with rlpB or
there may be a promoter within rlpB. The nearest possible initiation
signals for transcription (the putative RNA polymerase signals) and
translation (Shine-Dalgarno) are underlined in the sequence given above;
the match to their respective consensus sequences is not strong,
suggesting a low utilization efficiency. Inefficient transcription
and/or translation may be expected for a gene encoding a subunit of a
holoenzyme present at only 10-20 copies/cell. The δ gene uses several
rare codons [CCC(Pro), ACA(Thr), GGA(Gly), AGT(Ser), AAT(Asn), TTA,
TTG, CTC (all Leu)] 2-5 times more frequently than average, which may
decrease translation efficiency. ATP binding to δ within the holoenzyme
has been detected previously by UV crosslinking. The sequence of
δ shows a near match to the ATP binding site consensus motif (i.e.
AX3GKS for δ at residues 219-225 compared to the published consensus
G/AX4GKS/T, G/AXGKS/T or G/AX2GXGKS/T [see Nuc. Acids Res 17:8413
(1989)]). Whether δ binds ATP specifically at this site remains to
be determined. Of the 33 arginine and lysine residues in δ, 16 of them
(50%) are within amino acids 225-307. This same region contains only
5 (14%) of the 35 glutamic and aspartic acid residues. Whether this
concentration of basic residues is significant to function is unknown.
There are no strong matches to consensus sequences for motifs encoding
zinc fingers or helix-loop-helix DNA binding domains.
The holA gene was cloned into M13mp18 and an NdeI site was
created at the initiating methionine by the site directed mutagenesis
technique in order to study the overproduction of this gene. The δ gene
was then excised from M13delta and inserted into the NdeI site of the
pET-3c expression vector [see Methods Enzymol 185:60 (1990)] which
places δ under control of a strong T7 RNA polymerase promoter, see Fig
3-1. Upon transformation into BL21(DE3) cells and induction of T7 RNA
polymerase with IPTG, the δ protein was expressed to 27% total cell
protein. For reasons unknown, δ was not produced in BL21(DE3) cells
containing the pLysS plasmid. Induction at 25°C yielded approximately
2-fold more δ and increased the solubility of the overproduced δ
relative to induction at 37°C. Twelve liters of induced cells were
lysed using lysozyme and 141 mg of pure δ was obtained in 28% overall
yield upon column fractionation using Hexylamine Sepharose, Heparin
Agarose, and Q Sepharose. Delta protein tended to precipitate upon
standing in low salt (<70 mM), especially during dialysis. Therefore,
low salt was avoided except for short periods of time and column
fractions containing δ were sometimes diluted in preparation for the
next column rather than dialyzed overnight. The δ subunit was assayed
by its ability to reconstitute efficient replication of a singly primed
M13mp18 ssDNA "coated" with SSB in the presence of α, ε, β, and γ
subunits. Cell lysate prepared from induced cells containing pETdelta
was more active in the replication assay than cell lysate prepared
from induced cells containing the pET-3c vector.

The expressed δ protein comigrated with the authentic δ subunit
contained within the γ complex of the holoenzyme. The N-terminal
sequence analysis of the pure cloned δ was identical to that predicted
from the holA sequence according to the present invention, indicating
that the protein encoded by the gene had been purified. Furthermore, the
overproduced δ subunit was active with only the α, ε, γ and β subunits
of the holoenzyme (Fig 5-1). In the presence of a sixth subunit, δ',
activity was enhanced. The amount of the cloned δ required to
reconstitute the efficient DNA synthesis characteristic of the
holoenzyme using the 5 or 6 subunit combination according to the
present invention is in the range shown previously for the naturally
purified δ resolved from the γ complex. As shown below, addition of
more γ to the replication assay brings the amount of δ down even further
to about 1 ng for a stoichiometry of about 1-2 δ monomers per DNA
circle replicated.



Electrospray mass spectrometry of the cloned δ protein yielded a
molecular mass of 38,704 Da. This mass is within 0.0015% of the mass
predicted from the gene, well within the 0.01% error of the mass
spectrometry technique. This is evidence that the DNA sequence above
according to the present invention contains no errors and indicates the
overproduced δ is not modified during or after translation. The ε280
calculated from the amino acid composition of δ is 46,230 M⁻¹ cm⁻¹.
The measured absorbance of δ in 6 M guanidine hydrochloride is only 0.2%
higher than in buffer A. Hence, the ε280 of native δ is 46,137 M⁻¹ cm⁻¹.
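
The extinction coefficient arithmetic can be reproduced with a short
Python sketch. This is an illustration under stated assumptions, not
necessarily the procedure actually used: the per-residue
coefficients (5690 for tryptophan and 1280 for tyrosine, in
M⁻¹ cm⁻¹ at 280 nm in 6 M guanidine hydrochloride) are commonly used
literature values, and the counts of 7 Trp and 5 Tyr were tallied
from the amino acid sequence listed above.

```python
# Extinction coefficient at 280 nm from amino acid composition.
# Per-residue values are assumed literature values for denatured protein.
EPSILON_TRP = 5690   # M^-1 cm^-1 per tryptophan (assumed)
EPSILON_TYR = 1280   # M^-1 cm^-1 per tyrosine (assumed)


def epsilon_280(n_trp, n_tyr):
    """Sum the Trp and Tyr contributions to the absorbance at 280 nm."""
    return n_trp * EPSILON_TRP + n_tyr * EPSILON_TYR


def native_epsilon(denatured_epsilon, percent_higher_in_denaturant):
    """Correct for the denatured sample absorbing slightly more than native."""
    return denatured_epsilon / (1 + percent_higher_in_denaturant / 100.0)


if __name__ == "__main__":
    denatured = epsilon_280(n_trp=7, n_tyr=5)   # residue counts are a tally
    print(denatured)
    print(native_epsilon(denatured, 0.2))
```

With these inputs the calculated value is 46,230 M⁻¹ cm⁻¹, and
dividing by 1.002 gives a native value that agrees with the quoted
46,137 M⁻¹ cm⁻¹ to within one unit.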
For further understanding of the individual subunits, the present
invention also determines whether δ and δ' are monomeric, dimeric or
higher order structures. The δ and δ' subunits were each analyzed
in a gel filtration column, and they migrated in essentially the same
position as one another (fractions 30-32). As discussed below, δ'
appears as two proteins, δ'L and δ'S, which differ by approximately 0.5
kDa. Comparison with protein standards of known Stokes radius yielded
a Stokes radius of 26.5 Å for δ and 25.8 Å for δ', slightly smaller than
the 27.5 Å radius of the 43.5 kDa ovalbumin standard, indicating that
both δ and δ' are monomeric (their gene sequences predict: δ, 38.7 kDa;
δ', 36.9 kDa). In a glycerol gradient sedimentation analysis both δ and
δ' migrated in the same position as one another with an S value of 3.0
relative to protein standards, a slightly lower sedimentation value than
the 43.5 kDa ovalbumin standard, again indicating a monomeric state for
δ and δ'. Besides protein mass, the protein shape is also a
determinant of both the Stokes radius and the S value obtained by these
techniques. The shape, however, causes opposite behavior in these two
techniques: a protein with an asymmetric shape behaves in gel
filtration as a larger protein than if it were spherical (elutes early)
and behaves in sedimentation like a smaller protein than if it were
spherical (sediments slower). The Stokes radius and S value can be
combined in the equation of Siegel and Monty, whereupon the protein
shape factor cancels. Therefore, the native mass of the protein
obtained from such treatment is more accurate than calculating the
mass from only the S value or the Stokes radius and assuming a
spherical shape. This calculation yielded a native mass of 34.7 kDa for
δ and 33.8 kDa for δ', values similar to the monomer molecular mass
predicted from the gene sequences of δ and δ', further evidence they are
monomers. Their frictional coefficients are each significantly greater
than 1.0, indicating they are not spherical but have some asymmetry to
their shape. One can also conclude from this work that the two δ'
subunits are a mixture of δ'L and δ'S rather than a complex of δ'L and
δ'S.
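
The Siegel and Monty treatment combines the Stokes radius a and the
sedimentation coefficient s as M = 6πηN·a·s / (1 − v̄ρ). A minimal
Python sketch of that calculation follows. It is illustrative only:
the solvent viscosity and density (water at 20°C) and the partial
specific volume of 0.73 cm³/g are assumed typical values that are not
given in this specification, so the result is expected to land near,
not exactly on, the masses quoted above.

```python
import math

AVOGADRO = 6.022e23   # molecules per mole
ETA = 0.01002         # g/(cm*s), viscosity of water at 20 C (assumed)
RHO = 0.9982          # g/cm^3, density of water at 20 C (assumed)
V_BAR = 0.73          # cm^3/g, partial specific volume (assumed typical value)


def native_mass(stokes_radius_angstrom, s_svedberg):
    """Siegel-Monty native mass (g/mol) from Stokes radius and S value."""
    a_cm = stokes_radius_angstrom * 1e-8
    s_sec = s_svedberg * 1e-13
    return 6 * math.pi * ETA * AVOGADRO * a_cm * s_sec / (1 - V_BAR * RHO)


def frictional_ratio(stokes_radius_angstrom, mass):
    """Ratio of the Stokes radius to the radius of an anhydrous sphere of equal mass."""
    a_cm = stokes_radius_angstrom * 1e-8
    sphere_radius = (3 * V_BAR * mass / (4 * math.pi * AVOGADRO)) ** (1 / 3)
    return a_cm / sphere_radius


if __name__ == "__main__":
    mass_delta = native_mass(26.5, 3.0)   # Stokes radius and S value quoted for delta
    print(round(mass_delta), round(frictional_ratio(26.5, mass_delta), 2))
```

With these assumed constants the estimate for δ falls in the low to
mid 30 kDa range with a frictional ratio above 1, consistent with a
somewhat asymmetric monomer; the exact figure depends on the partial
specific volume and on the unrounded S value.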

In initial studies using the cloned δ, δ forms only a weak complex
with γ but, together with δ', a stable γδδ' complex can be reconstituted
which remains intact in gel filtration and ion exchange chromatography.
Likewise, δ' forms only a weak complex with γ, and requires the δ
subunit to bind γ tightly. Both δ and δ' appear monomeric and bind to
each other to form a δδ' heterodimer.

Availability of the δ subunit in large quantity will allow detailed
studies of the mechanism of the γ complex in β clamp formation.
Further, identification of the δ gene will provide for genetic analysis
(essentiality) of δ in E. coli replication and possibly other roles of δ
in DNA metabolism.

The second subunit according to the present invention, that of δ',
was also identified from the δδ' fraction in like manner. The N-terminal
sequence, comprising the first 18 amino acids in the peptide, and the
tryptic peptide sequence were obtained. The amino acid sequence
determined from the initial sequence studies for the δ' peptide is:

Met Arg Trp Tyr Pro Trp Leu Arg Pro Asp Phe Glu Lys Leu Val
                  5                  10                  15
Ala Ser Tyr Gln Ala Gly Arg Gly His His Ala Leu Leu Ile Gln
                 20                  25                  30
Ala Leu Pro Gly Met Gly Asp Asp Ala Leu Ile Tyr Ala Leu Ser
                 35                  40                  45
Arg Tyr Leu Leu Cys Gln Gln Pro Gln Gly His Lys Ser Cys Gly
                 50                  55                  60
His Cys Arg Gly Cys Gln Leu Met Gln Ala Gly Thr His Pro Asp
                 65                  70                  75
Tyr Tyr Thr Leu Ala Pro Glu Lys Gly Lys Asn Thr Leu Gly Val
                 80                  85                  90
Asp Ala Val Arg Glu Val Thr Glu Lys Leu Asn Glu His Ala Arg
                 95                 100                 105
Leu Gly Gly Ala Lys Val Val Trp Val Thr Asp Ala Ala Leu Leu
                110                 115                 120
Thr Asp Ala Ala Ala Asn Ala Leu Leu Lys Thr Leu Glu Glu Pro
                125                 130                 135
Pro Ala Glu Thr Trp Phe Phe Leu Ala Thr Arg Glu Pro Glu Arg
                140                 145                 150
Leu Leu Ala Thr Leu Arg Ser Arg Cys Arg Leu His Tyr Leu Ala
                155                 160                 165
Pro Pro Pro Glu Gln Tyr Ala Val Thr Trp Leu Ser Arg Glu Val
                170                 175                 180
Thr Met Ser Gln Asp Ala Leu Leu Ala Ala Leu Arg Leu Ser Ala
                185                 190                 195
Gly Ser Pro Gly Ala Ala Leu Ala Leu Phe Gln Gly Asp Asn Trp
                200                 205                 210
Gln Ala Arg Glu Thr Leu Cys Gln Ala Leu Ala Tyr Ser Val Pro
                215                 220                 225
Ser Gly Asp Trp Tyr Ser Leu Leu Ala Ala Leu Asn His Glu Gln
                230                 235                 240
Ala Pro Ala Arg Leu His Trp Leu Ala Thr Leu Leu Met Asp Ala
                245                 250                 255
Leu Lys Arg His His Gly Ala Ala Gln Val Thr Asn Val Asp Val
                260                 265                 270
Pro Gly Leu Val Ala Glu Leu Ala Asn His Leu Ser Pro Ser Arg
                275                 280                 285
Leu Gln Ala Ile Leu Gly Asp Val Cys His Ile Arg Glu Gln Leu
                290                 295                 300
Met Ser Val Thr Gly Ile Asn Arg Glu Leu Leu Ile Thr Asp Leu
                305                 310                 315
Leu Leu Arg Ile Glu His Tyr Leu Gln Pro Gly Val Val Leu Pro
                320                 325                 330
Val Pro His Leu
            334

From these sequences, two DNA oligonucleotide probes were
made and used (after end-labelling with 32P for use in Southern blot
analysis) to probe a Southern blot of E. coli DNA which was grown,
isolated and restricted as above. The sequences of the two probes
were:

probe 1:

ACT CTG GAA GAA CCG CCG GCT GAA ACT TGG TTT TTT CTG GCT 42
ACT CGT GAA CCG GAA 57; and

probe 2:

GCT GGT TCT CCG GGT GCT GCT CTG GCT CTG TTT CAG GGT GAT 42
AAC TGG CAG GCT 54.

Of the two Southern blots analyzed (one with the 57-mer probe
and the other with the 54-mer probe), the patterns from the blots had
one set of bands in common, and these were sized by comparison with
size standards in the same gel following recognized techniques. The
sizes of these 8 common "bands" or DNA fragments produced by digestion
with 8 restriction enzymes were used to scan, by eye, the restriction
map of the E. coli genome [see Cell 50:495 (1987)]. One unique location
on the genome was located which was compatible with all 8 restriction
fragment sizes.

Phage λ236 was selected as a phage containing the "unique
location" in the E. coli genome. The δ' gene was excised from the λ236
phage using restriction enzymes EcoRV and KpnI to yield a 2.3 kb
fragment of DNA. This fragment was then ligated into pUC18 and
sequenced using a Sequenase kit (US Biochemicals) in accordance with
the manufacturer's instructions. The fragment was also ligated into an
M13mp18 vector for making a site specific mutation, as described
above, at the ATG start codon (i.e., changing the CGCATG to CATATG,
thereby allowing NdeI to cleave the DNA at CATATG, whereas it
could not cleave the DNA at the normal CGCATG sequence).
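
The effect of the start codon mutation can be illustrated with a
simple site search. In the Python sketch below, the wild-type
fragment is assembled from the end of the upstream sequence and the
first codons of holB as printed in this specification; the function
name is hypothetical, and the sketch only locates the NdeI
recognition sequence and does not model cleavage itself.

```python
NDEI_SITE = "CATATG"   # NdeI recognition sequence


def find_sites(sequence, site=NDEI_SITE):
    """Return 0-based start positions of every occurrence of a recognition site."""
    sequence = sequence.upper()
    last_start = len(sequence) - len(site)
    return [i for i in range(last_start + 1) if sequence.startswith(site, i)]


# Region around the holB start codon (upstream ...AGGAGTTGGACGC, then ATG AGA TGG TAT CCA).
WILD_TYPE = "AGGAGTTGGACGCATGAGATGGTATCCA"
# Site directed change of CGCATG to CATATG at the start codon.
MUTANT = WILD_TYPE.replace("CGCATG", "CATATG", 1)

if __name__ == "__main__":
    print(find_sites(WILD_TYPE))   # no NdeI site in the wild-type fragment
    print(find_sites(MUTANT))      # one site, spanning the ATG start codon
    changed = sum(a != b for a, b in zip(WILD_TYPE, MUTANT))
    print(changed)                 # number of substituted bases
```

Only two bases change (CGC to CAT immediately before the ATG), so the
coding sequence of the gene itself is left untouched while the start
codon becomes part of a cleavable NdeI site.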

The nucleic acid sequence obtained from these studies predicted
the amino acid sequence determined for the δ' peptide in frame, and thus
the selected sequence was that for the δ' gene. The nucleic acid
sequence, according to the present invention, for this second subunit,
δ', is:
ATG AGA TGG TAT CCA TGG TTA CGA CCT GAT TTC GAA AAA 39
CTG GTA GCC AGC TAT CAG GCC GGA AGA GGT CAC CAT GCG 78
CTA CTC ATT CAG GCG TTA CCG GGC ATG GGC GAT GAT GCT 117
TTA ATC TAC GCC CTG AGC CGT TAT TTA CTC TGC CAA CAA 156
CCG CAG GGC CAC AAA AGT TGC GGT CAC TGT CGT GGA TGT 195
CAG TTG ATG CAG GCT GGC ACG CAT CCC GAT TAC TAC ACC 234
CTG GCT CCC GAA AAA GGA AAA AAT ACG CTG GGC GTT GAT 273
GCG GTA CGT GAG GTC ACC GAA AAG CTG AAT GAG CAC GCA 312
CGC TTA GGT GGT GCG AAA GTC GTT TGG GTA ACC GAT GCT 351




arc, tta cta acc gag, gcc G rc rct aac oca TTG CTG aaa, 390 
Ann mr gaa gag era oca cta gaa act Tffl TTT TTC CTG 429 
Arr. cgc gag nrr gaa a ?r tta cro cta aga TTA CGT 468 
ACT CGT TGT CGG TTA CAT TAC PTT GCG CCG CCG CCG GAA 507 
5 pap; tac gcc gtg act, tog c tt TCA CGC GAA, GPG ACA ATG 546 
TCA CAG GAT GCA TTA CTT GCC GCA TTG CGC TTA AGC GCC 585
amr.iTG CCT aac. gcg OTA rm gcg ttg TTT CAG GGA GAT 624 
AAT TGG CAG GCT CGT GAA ACA TTG TGT CAG GCG TTG GCA 663
TAT AGC GTG CCA TCG GGC GAT TGG TAT TCG CTG CTA GCG 702
GCC CTT AAT CAT GAA CAA GTC CCG GCG CGT TTA CAC TGG 741
CTG GCA ACG TTG CTG ATG GAT GCG CTA AAA CGC CAT CAT 780
GGT GCT GCG CAG GTG ACC AAT GTT GAT GTG CCG GGC CTG 819
GTC GCC GAA CTG GCA AAC CAT CTT TCT CCC TCG CGC CTG 858
CAG GCT ATA CTG GGG GAT GTT TGC CAC ATT CGT GAA CAG 897
TTA ATG TCT GTT ACA GGC ATC AAC CGC GAG CTT CTC ATC 936
ACC GAT CTT TTA CTG CGT ATT GAG CAT TAC CTG CAA CCG 975
GGC GTT GTG CTA CCG GTT CCT CAT CTT 1002 

The underlined portions of this sequence correspond to the tryptic
peptides δ'-1 (283-315), δ'-2 (316-327), δ'-3 (328-390), δ'-4 (391-462),
δ'-5 (481-534), and δ'-6 (577-639). In addition, the upstream sequence:
AAGAATCTTT CGATTTCTTT AATCGCACCC GCGCCCGCTA TCTGGAACTG 50
GCAGCACAAG ATAAAAGCAT TCATACCATT GATGCCACCC AGCCGCTGGA 100
GGCCGTGATG GATGCAATCC GCACTACCGT GACCCACTGG GTCAAGGAGT 150
TGGACGC 157

contains an underlined putative translational signal: Shine-Dalgarno.

In addition, the downstream nucleic acid sequence for δ' begins
with a stop codon:

TEA GAGAGACATC ATGTTTTTAG TGGACTCACA CTGCCATCTC 43 
GATGGTCTGG ATTATGAATC TTTGCATAAG GACGTGGATG ACGTTCTGGC 93 

GAAAGCCGCC GCACGCGATG TGAAATTTTG TCTGGCAGTC GCCACAACAT 143

The δ' gene (holB) was then subcloned into M13mp18, and an NdeI
site was created at the initiating codon as described above. The δ' gene
was then excised from M13 using NdeI restriction enzyme and a second
enzyme which cut downstream of δ', and the excised gene was subcloned
into the pET-3c overexpression plasmid using the same techniques
described above. Following overexpression of the δ' protein, the protein
was purified using a Fast Flow Q - Heparin - Hexylamine technique as
described herein. Ninety mg of δ' protein was obtained from 4 liters of
cells.

Further studies on the δ' gene were conducted to make certain
that the gene sequence obtained from this research was actually the δ'
gene and not some artifact. These studies showed that the gene
sequence according to the present invention predicted all the peptide
sequence information, that the product of the cloned δ' gene comigrates
with the naturally occurring δ' on a 13% SDS-PAGE gel, that the product
of the cloned δ' gene stimulates the 5 protein system as does the
naturally occurring δ', and that δ' forms a δδ' complex with δ in a
similar manner to that which occurs with the naturally occurring δ'
and δ.

With specific regard to the isolation and characterization of δ'
and holB according to the present invention, the amino acid sequencing
was conducted using δ and δ' subunits purified to apparent homogeneity
through the ATP-agarose column step [see J. Biol. Chem. 265:1179
(1990)] from 1.3 kg of the γ/τ overproducing strain of E. coli:
HB101(pNT203, pSK100) [see J. Biol. Chem. 263:6555 (1988)]. The δ and
δ' subunits were separated on a 13% SDS polyacrylamide gel whereupon
the δ' resolved into two bands. The slower and faster migrating δ' bands
are referred to as δ'L (large) and δ'S (small), respectively; δ'S was
approximately 2 times the abundance of δ'L. Both δ'L and δ'S were
electroblotted onto PVDF membrane (Whatman) [see J. Biol. Chem.
262:10035 (1987)] for N-terminal sequencing (50 pmol each of δ'L and
δ'S), and onto nitrocellulose membrane (Schleicher and Schuell) [see
Proc. Natl. Acad. Sci. USA 84:6970 (1987)] for tryptic analysis (90 pmol
of δ'L and 180 pmol of δ'S). Proteins were visualized by Ponceau S stain
(Sigma).

Analysis of the more abundant δ'S was as follows: the N-terminal
sequence was:

NH2-Met Arg Trp Tyr Pro Pro Leu (Arg)(Pro) Asp Phe Glu Lys Leu Val Ala
                      5                    10                  15

and the tryptic peptides were:

δ'-1:
NH2-Glu Val Thr Glu Lys Leu Asn Glu His Ala Arg;
                      5                  10

δ'-3:
NH2-Val Val Trp Val Thr Asp Ala Ala Leu Leu Thr Asp
                      5                  10
    Ala Ala Ala Asn Ala Leu Leu Lys;
             15                  20

δ'-4:
NH2-Thr Leu Glu Glu Pro Pro Ala Glu Thr Trp Phe Phe Leu Ala
                      5                  10
    Thr Arg Glu Pro (Glu) (Arg) Leu Leu Ala Thr (Leu);
     15                     20

δ'-5:
NH2-Leu His Tyr Leu Ala Pro Pro (Pro) Glu Gln Tyr Ala Val
                      5                    10
    Thr (Trp) Leu Ser Arg; and
          15

δ'-6:
NH2-Leu Ser Ala Gly Ser Pro Gly Ala Ala Leu Ala Leu Phe Gln
                      5                  10
    Gly Asp Asn Trp Gln Ala Arg.
     15                  20

Sequence analysis of tryptic peptides of the less abundant δ'L
yielded:

δ'-2:
NH2-Leu Gly Gly Ala Lys; and
                      5

δ'-7 (same as δ'-3):
NH2-Val Val Trp Val Thr Asp Ala Ala Leu Leu Thr Asp
                      5                  10
    Ala Ala Ala Asn Ala Leu Leu Lys.
             15                  20

Parentheses in the above sequences indicate uncertain
assignments.

Two synthetic oligonucleotide probes (DNA oligonucleotides,
Oligos etc. Inc.) were designed from the sequence of two of the tryptic
peptides and the codon usage of E. coli with allowance for a T-G
mispair at the wobble position. A synthetic DNA 57-mer probe was
based on the sequence of δ'-4 (amino acids 131-149):

Ala Cys Thr Cys Thr Gly Gly Ala Ala Gly Ala Ala Cys Cys Gly
                  5                  10                  15
Cys Cys Gly Gly Cys Thr Thr Gly Ala Ala Ala Cys Thr Thr Gly
                 20                  25                  30
Gly Thr Thr Thr Thr Thr Thr Cys Thr Gly Gly Cys Thr Ala Cys
                 35                  40                  45
Thr Cys Gly Thr Gly Ala Ala Cys Cys Gly Gly Ala Ala
                 50                  55

(after identification and sequencing of holB this probe was found to be
incorrect at 11 positions). A DNA 54-mer probe was based on the sequence
of δ'-6 (amino acids 195-212):

Gly Cys Thr Gly Gly Thr Thr Cys Thr Cys Cys Gly Gly Gly Thr
                  5                  10                  15
Gly Cys Thr Gly Cys Thr Cys Thr Gly Gly Cys Thr Cys Thr Gly
                 20                  25                  30
Thr Thr Thr Cys Ala Gly Gly Gly Thr Gly Ala Thr Ala Ala Cys
                 35                  40                  45
Thr Gly Gly Cys Ala Gly Gly Cys Thr
                 50

(after identification and sequencing of holB the probe was found to be
incorrect at 9 positions). These probes (100 pmol each) were 5'
end-labelled with 1 nM [γ-32P]ATP (radionucleotides, DuPont-New England
Nuclear) and polynucleotide kinase. E. coli genomic DNA (strain C600)
was extracted [see J. Mol. Bio. 3:208 (1961)] and restricted with either
BamHI, HindIII, EcoRI, EcoRV, BglI, KpnI, PstI or PvuII (DNA
modification enzymes, New England Biolabs) and then each digest was
electrophoresed in a 0.8% native agarose gel followed by depurination
(0.25 M HCl), denaturation (0.5 M NaCl) and then neutralization (1 M
Tris, 2 M NaCl, pH 5.0) prior to transfer to Gene Screen Plus (DuPont-New
England Nuclear) for Southern analysis using a Vacugene apparatus
(Pharmacia) in the presence of 2XSSC (0.3 M NaCl, 0.3 M sodium acetate,
pH 7.0). Conditions for hybridization and washing using these
oligonucleotide probes were determined empirically and the desired
results were obtained using a hybridization temperature of 42°C then
washing with 2XSSC and 0.2% SDS at successively higher temperature
until evaluation by autoradiography showed a single band in each lane
for the 57-mer, and two bands in each lane for the 54-mer (this
occurred at 53°C for both probes). Although the 54-mer showed two
bands in each lane, one band always matched the position of the band
probed with the 57-mer.

The 2.1 kb KpnI/EcoRV fragment containing holB was excised
from λ E9G1(236) [see Cell 50:495 (1987)] and directionally ligated into
pUC18 (KpnI/HincII) to yield pUC-δ'. Both strands of DNA were
sequenced by the chain termination method of Sanger using the United
States Biochemicals Sequenase kit, [α-35S]dATP, and synthetic DNA
18-mers.

A 2.1 kb KpnI/HindIII fragment containing the holB gene was
excised from pUC-δ' and directionally ligated into M13mp18 to yield
M13-δ'. An NdeI site was generated at the start codon of holB by
oligonucleotide site directed mutagenesis [see Methods Enzymol.
154:367 (1987)] using a DNA 33-mer:

Gly Gly Thr Gly Ala Ala Gly Gly Ala Gly Thr Thr Gly Gly Ala
                  5                  10                  15

Cys Ala Thr Ala Thr Gly Ala Gly Ala Thr Gly Gly Thr Ala Thr
                 20                  25                  30

Cys Cys Ala

containing the NdeI site (nucleotides 16-21) at the start codon of holB
to prime replication of M13-δ' viral ssDNA and using SSB and DNA
polymerase III holoenzyme (in place of DNA polymerase I) to replicate
the circular template without strand displacement. The M13 chimera is
called M13-δ'-NdeI. An NdeI fragment (1160 bp) containing the holB gene
was excised from M13-δ'-NdeI and ligated into pET3c, linearized using
NdeI, to yield pET-δ'. The orientation of the holB gene in pET-δ' was
determined by sequencing.

Reconstitution assays contained 108 ng M13mp18 ssDNA (0.05
pmol as circles) uniquely primed with a DNA 30-mer [see J. Biol. Chem.
266:11328 (1991)], 1.5 µg SSB (21 pmol as tetramer), 30 ng β (0.39 pmol
as dimer), 22.5 ng αε complex (0.14 pmol), 20 ng γ (0.12 pmol as dimer),
2 ng δ (0.05 pmol as monomer) and the indicated amount of δ' (or 1-5 ng
of column fraction during purification) in 20 mM Tris-HCl (pH 7.5), 8
mM MgCl2, 5 mM DTT, 4% glycerol, 40 µg/ml BSA, 0.5 mM ATP, 60 µM
dGTP, and 0.1 mM EDTA in a final volume of 25 µl (after the addition of
the remaining proteins). Assays of γ or τ activity with either δ, δ' or
δδ' contained either 2 ng δ (0.05 pmol as monomer), 2 ng δ' (0.05 pmol
as monomer), or 1 ng (0.025 pmol) each of δ and δ', and the indicated
amount of γ or τ. All proteins were added to the assay on ice and then
shifted to 37°C for 8 minutes to allow reconstitution of the processive
polymerase on the primed ssDNA. DNA synthesis was initiated upon
rapid addition of 60 µM dATP and 20 µM [α-32P]TTP, then quenched after
20 seconds and quantitated using DE81 paper. When needed, proteins
were diluted in 20 mM Tris-HCl (pH 7.5), 2 mM DTT, 0.5 mM EDTA, 20%
glycerol, and 50 µg/ml BSA. Proteins used in the reconstitution assays
were purified [see J. Biol. Chem. 266:9833 (1991)]. The concentrations
of β and δ were determined by absorbance using ε280 values of 17,900
M⁻¹ cm⁻¹ and 46,137 M⁻¹ cm⁻¹, respectively. Concentrations of α, ε, γ, τ
and SSB were then determined [see Anal. Biochem. 72:248 (1976)] using
BSA as a standard. The concentration of δ' was determined by
absorbance using an ε280 value of 60,136 M⁻¹ cm⁻¹.

ATPase assays were performed in a final volume of 20 µl
containing 20 mM Tris-HCl (pH 7.5), 8 mM MgCl2 and 285 ng
M13mp18 ssDNA. ATPase assays of γ, δ, δ', δδ', γδ and γδ' with and
without β contained 100 µM [γ-32P]ATP and, when present, 376 ng γ (4
pmol as dimer), 304 ng δ (7 pmol as monomer), 296 ng δ' (8.0 pmol as
monomer), and 320 ng β (4.2 pmol as dimer). Proteins were added on
ice, shifted to 37°C for 30 minutes, then 0.5 µl was spotted on a
plastic backed thin layer chromatography (TLC) sheet coated with
Cel-300 polyethyleneimine (Brinkman Instruments Co.). To assay the more
active ATPase activity of γδδ' and τ, 300 µM ATP was used, with less
total protein and less time at 37°C, in order to assess the initial
rate of reaction. Therefore, ATPase assays of γδδ', τ, τδ, τδ' and τδδ'
with and without β contained 300 µM [γ-32P]ATP and, when present, 47 ng
γ (0.5 pmol as dimer), 71 ng τ (0.5 pmol as dimer), 38 ng δ (1 pmol as
monomer), 37 ng δ' (1 pmol as monomer) and 40 ng β (0.5 pmol as dimer).
Proteins were added on ice, shifted to 37°C for 10 minutes, then
analyzed by TLC as described above.
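
The molar amounts quoted alongside the protein masses in these assay
recipes follow from dividing the mass by the molecular mass of the
particle. A minimal Python sketch of that conversion is given here for
illustration only. It assumes the monomer masses calculated from the
gene sequences reported elsewhere in this document for δ (38,704 Da)
and δ' (36,934 Da); the function name and its arguments are
hypothetical and not part of the original disclosure.

```python
def ng_to_pmol(mass_ng, monomer_mass_da, subunits_per_particle=1):
    """Convert a protein mass in ng to pmol of particles.

    ng divided by g/mol gives nmol, so the result is multiplied by 1000
    to obtain pmol. Use subunits_per_particle=2 to count dimers.
    """
    particle_mass_da = monomer_mass_da * subunits_per_particle
    return mass_ng / particle_mass_da * 1000.0


# Monomer masses taken from the gene-sequence values quoted in this document.
DELTA_DA = 38704
DELTA_PRIME_DA = 36934

if __name__ == "__main__":
    for label, ng, mass in [
        ("38 ng delta", 38, DELTA_DA),
        ("37 ng delta prime", 37, DELTA_PRIME_DA),
        ("296 ng delta prime", 296, DELTA_PRIME_DA),
    ]:
        print(f"{label}: {ng_to_pmol(ng, mass):.2f} pmol as monomer")
```

Under these assumptions the 37 ng and 296 ng amounts of δ' come out
close to the 1 pmol and 8.0 pmol stated above.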

TLC sheets were developed in 0.5 M lithium chloride, 1 M formic
acid. An autoradiogram of the TLC chromatogram was used to visualize
the free phosphate at the solvent front and ATP at the origin, which
were then cut from the TLC sheet and quantitated by liquid
scintillation. The amount of ATP hydrolyzed was calculated as the
percent of total radioactivity located at the solvent front (Pi) times
the total moles of ATP added to the reaction.
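
The calculation described above can be sketched in a few lines of
Python. This is an illustration under stated assumptions, not the
procedure actually used: the function names are hypothetical, and the
scintillation counts in the example are invented solely to show the
arithmetic.

```python
def total_atp_pmol(concentration_um, volume_ul):
    """Total ATP in the reaction; micromolar times microliters gives pmol."""
    return concentration_um * volume_ul


def atp_hydrolyzed_pmol(counts_phosphate, counts_atp, atp_added_pmol):
    """Fraction of radioactivity at the solvent front (Pi) times ATP added."""
    total_counts = counts_phosphate + counts_atp
    if total_counts <= 0:
        raise ValueError("no radioactivity recovered from the TLC sheet")
    return counts_phosphate / total_counts * atp_added_pmol


if __name__ == "__main__":
    # A 20 microliter reaction at 100 micromolar ATP, as in the assay above.
    added = total_atp_pmol(100, 20)
    # Hypothetical counts for the Pi spot and the ATP spot.
    print(atp_hydrolyzed_pmol(1500, 8500, added), "pmol ATP hydrolyzed")
```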

The results of the δ' studies appear below:

The naturally purified δ' (resolved from the γ complex) appears in
a 13% SDS polyacrylamide gel as two bands of approximately 37 kDa
that differ in size by about 1 kDa. The larger protein (δ'L) is
approximately one half the abundance of the smaller one (δ'S). Both
δ'L and δ'S are believed to be encoded by the same gene as there was
no noticeable difference in their HPLC profiles upon digestion with
trypsin. In support of this, peptides from δ'S and δ'L that had the
same retention time on HPLC analysis also had identical amino acid
sequences (peptide δ'-7 from δ'S and δ'-3 from δ'L were identical). The
N-terminus of δ'S and five tryptic peptides of δ'S and two tryptic
peptides of δ'L were sequenced.

A search of GenBank revealed no match to the N-terminal
sequence or to any of the tryptic peptides from either δ'L or δ'S. Two
best-guess oligonucleotide probes (a 57-mer and a 54-mer) were
designed from tryptic peptides δ'-4 and δ'-6 based on the codon usage
frequency in E. coli [see PNAS USA 80:687 (1983)]. The oligonucleotide
probes were used in a Southern analysis of E. coli genomic DNA
digested with each of the eight Kohara restriction map enzymes.
Imposing the restraint that the eight restriction fragments from the
Southern analysis must overlap the holB gene, the Kohara map of the E.
coli chromosome was searched and only one position of overlap at 24.3
minutes (1,174 kb on the E. coli chromosome starting from thrA) was
found which satisfied the fragment sizes. The fragment sizes in the
Kohara map and from the Southern analysis are given in the following
table, which depicts the correspondence of the observed size of genomic
DNA restriction fragments with the Kohara restriction map of the E.
coli chromosome in the region of 24 minutes. E. coli genomic DNA was
digested with the restriction enzymes indicated. The sizes of the
restriction fragments that were in common for both the 57-mer and
54-mer probes in the Southern analysis and also the corresponding
sizes of the restriction fragments on the Kohara restriction map of the
E. coli chromosome at 24.5 minutes are listed below.
Restriction          Size of restriction fragment (kb)
enzyme               Southern          Kohara map

PstI                 1.7               1.9
BglI                 4.25              4.2
KpnI                 6.6               6.4
EcoRV                7.0               6.8
PvuII                6.2               6.2
EcoRI                >15               16.2
HindIII              >20               30
BamHI                >25               38
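
The comparison summarized in the table can be expressed as a simple
consistency check, sketched below in Python for illustration. The
fragment sizes are those listed above. The 15% tolerance is an
assumption made for this sketch (no acceptance criterion is stated in
the text), and the function names are hypothetical. Sizes reported
only as a lower bound (for example ">15") are treated as satisfied
when the Kohara map fragment is larger than the bound.

```python
# (enzyme, Southern size in kb, Southern value is a lower bound, Kohara size in kb)
FRAGMENTS = [
    ("PstI", 1.7, False, 1.9),
    ("BglI", 4.25, False, 4.2),
    ("KpnI", 6.6, False, 6.4),
    ("EcoRV", 7.0, False, 6.8),
    ("PvuII", 6.2, False, 6.2),
    ("EcoRI", 15.0, True, 16.2),
    ("HindIII", 20.0, True, 30.0),
    ("BamHI", 25.0, True, 38.0),
]


def is_consistent(southern_kb, is_lower_bound, kohara_kb, tolerance=0.15):
    """Return True when the observed fragment agrees with the map fragment."""
    if is_lower_bound:
        return kohara_kb > southern_kb
    return abs(southern_kb - kohara_kb) / kohara_kb <= tolerance


def all_consistent(fragments, tolerance=0.15):
    """Check every enzyme; the tolerance is an assumed relative deviation."""
    return all(
        is_consistent(southern, bound, kohara, tolerance)
        for _, southern, bound, kohara in fragments
    )


if __name__ == "__main__":
    for enzyme, southern, bound, kohara in FRAGMENTS:
        print(enzyme, is_consistent(southern, bound, kohara))
```

A search of a full restriction map would apply the same test at each
candidate map position and keep only positions where all eight
enzymes agree.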



The Kohara λ phage E9G1(236) contains 16.2 kb of DNA
surrounding the putative holB gene. A 2.1 kb KpnI/EcoRV fragment
containing holB was excised from λ E9G1(236), cloned into pUC18 and
sequenced. The sequence of the KpnI/EcoRV fragment revealed an open
reading frame of 1002 nucleotides which predicts a 334 amino acid
protein of 36.9 kDa (predicted pI of 7.04), consistent with the mobility
of δ' in an SDS polyacrylamide gel. The open reading frame encodes the
N-terminal sequence and all six tryptic peptide sequences obtained
from δ'L and δ'S.

Analysis of the DNA sequence upstream of the open reading frame
revealed a putative translation initiation signal (Shine-Dalgarno
sequence) 8 nucleotides upstream of the ATG initiating codon. No
obvious transcription initiation signals were detected upstream of the
initiation codon, leaving open the possibility that holB is in an operon
with an upstream gene(s). Alternatively, the transcription initiation
signals may poorly match the consensus signals and thereby be
unrecognizable, as a low level of transcription would not be unexpected
for a gene encoding a subunit of the holoenzyme present at only 10-20
copies/cell. The holB gene uses several rare codons [TTA (Leu), ACA
(Thr), GGA (Gly), AGC, TCG (Ser)] 2-4 times more frequently than
average, which may decrease translation efficiency.

The holB sequence contains a helix-turn-helix consensus motif
(Ala/Gly-X3-Gly-X5-Ile/Val) at Ala80-Gly84-Val90, although the ability
of δ' to bind DNA has yet to be examined. There is also a possible
leucine zipper (Leu7-X6-Leu14-X6-Gly21-X6-Leu28) in the N-terminus,
although Gly interrupts the Leu pattern. The holB sequence does not
contain consensus sequences for motifs encoding an ATP-binding site or
a zinc finger. The molar extinction coefficient of δ' calculated from
its 8 Trp and 11 Tyr residues is 59,600 M⁻¹ cm⁻¹. The absorbance of δ'
in the presence of 6 M guanidine hydrochloride was only 0.9% lower than
that of the native protein, for a native extinction coefficient of
60,136 M⁻¹ cm⁻¹.
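
The calculated extinction coefficient can be reproduced from the
residue counts, as sketched below in Python for illustration. The
per-residue molar absorptivities (5,690 M⁻¹ cm⁻¹ for Trp and 1,280
M⁻¹ cm⁻¹ for Tyr at 280 nm) are an assumption: they are commonly used
values for denatured protein and happen to reproduce the 59,600 figure,
but the text does not state which values were used. Any contribution
from cystine is ignored here, which is a further assumption. The
function name is hypothetical.

```python
# Assumed molar absorptivities at 280 nm (per M per cm) for denatured protein.
EPSILON_TRP = 5690
EPSILON_TYR = 1280


def calculated_e280(n_trp, n_tyr):
    """Sum the Trp and Tyr contributions to the 280 nm extinction coefficient."""
    return n_trp * EPSILON_TRP + n_tyr * EPSILON_TYR


if __name__ == "__main__":
    calculated = calculated_e280(8, 11)   # residue counts quoted above
    native = 60136                        # native value quoted above
    percent_lower = (native - calculated) / native * 100
    print(calculated, f"{percent_lower:.1f}% lower than the native value")
```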

To obtain the δ' subunit in large quantity, an expression plasmid
was constructed. The holB gene was first cloned into M13mp18
followed by site directed mutagenesis to create an NdeI site at the
initiating methionine to allow precise subcloning of holB into the pET3c
expression vector. The holB gene was excised from the M13-δ'-NdeI
mutant using NdeI followed by insertion into the NdeI site of the pET3c
expression vector [see Methods Enzymol. 185:60 (1990)], which places
holB under the control of the T7 RNA polymerase promoter of T7 gene
10 and the efficient Shine-Dalgarno sequence of gene 10. The pET-δ'
construct was transformed into BL21(DE3)pLysS cells which harbor a λ
lysogen containing the T7 RNA polymerase gene controlled by the lac
UV5 promoter. Upon induction of T7 RNA polymerase with IPTG, the δ'
protein was expressed to 50% of total cell protein. Cell lysate prepared
from the induced cells containing pET-δ' was 5600-fold more active in
the replication assays than cell lysate prepared from induced cells
containing the pET3c vector, as described below.

Three hundred liters of BL21(DE3)pLysS cells harboring pET-δ'
were grown at 37°C in LB media supplemented with 5 mg/ml glucose,
10 µg/ml thiamine, 50 µg/ml thymine containing 100 µg/ml ampicillin
and 25 µg/ml chloramphenicol. Upon growth to an OD600 of 0.6, IPTG
was added to 0.2 mM. After further growth for two hours the cells (940
g) were collected by centrifugation, resuspended in an equal weight of
50 mM Tris-HCl (pH 7.5), 10% sucrose (Tris-Sucrose) and stored at
-70°C. 100 g of cells (30 liters of cell culture) were thawed,
whereupon they lysed (due to lysozyme produced by pLysS), and to this
was added 250 ml Tris-Sucrose, DTT to 2 mM and 40 ml of 10x heat
lysis buffer (50 mM Tris-HCl (pH 7.5), 10% sucrose, 0.3 M spermidine,
1 M NaCl). The cell debris was removed by centrifugation to yield the
cell lysate (Fraction I, 4.41 g in 325 ml). The purification steps that
followed were performed at 4°C. The reconstitution activity assay for
δ' is as described previously. Ammonium sulphate (0.21 g/ml) was
dissolved in the clarified cell lysate and stirred for 90 minutes. The
precipitated protein containing δ' was pelleted (Fraction II, 1.58 g)
and redissolved in 660 ml of 30 mM Hepes-NaOH (pH 7.2), 10% glycerol,
0.5 mM EDTA, 2 mM DTT (buffer A) and dialyzed against two successive
changes of 2 liters each of buffer A to a conductivity equal to 40 mM
NaCl. The Fraction II was loaded onto a 300 ml heparin agarose column
(BioRad) equilibrated with buffer A. The heparin column was washed
with 450 ml buffer A plus 20 mM NaCl, then eluted over a period of 14
hours using a 2.5 liter linear gradient of 20 mM NaCl to 300 mM NaCl in
buffer A. One hundred fractions were collected. Fractions 36-53 were
pooled (Fraction III, 550 ml, 990 mg) and dialyzed twice against 2
liters of 20 mM Tris-HCl (pH 7.5), 10% glycerol, 0.5 mM EDTA, 2 mM DTT
(buffer B) to a conductivity equal to 60 mM NaCl. The Fraction III was
loaded onto a 100 ml Q sepharose column (Pharmacia) equilibrated with
buffer B. The loaded Q sepharose column was washed with 150 ml of
buffer B plus 20 mM NaCl then eluted over a period of 12 hours using a
1.2 liter linear gradient of 20 mM NaCl to 300 mM NaCl in buffer B.
Eighty fractions were collected. Fractions 34-56 were pooled (Fraction
IV, 781 mg in 370 ml) and dialyzed twice against 2 liters each of buffer
B to a conductivity equal to 60 mM NaCl just prior to loading onto a 60
ml EAH sepharose column (Pharmacia) that was equilibrated with buffer
B. The loaded EAH sepharose column was washed with 60 ml of buffer B
plus 40 mM NaCl then eluted over a period of 10 hours using a 720 ml
linear gradient of 40 mM NaCl to 500 mM NaCl in buffer B. Eighty
fractions were collected. Fractions 18-30 (Fraction V, 732 mg in 130
ml), which contained homogeneous δ', were pooled and dialyzed against 2
L buffer B (lacking DTT to allow an absorbance measurement, see
below) to a conductivity of 40 mM NaCl. Fraction V was passed over a 5
ml ATP-agarose column (Pharmacia, Type II, N-6 linked) to remove any γ
complex contaminant, followed by addition of DTT to 2 mM, and then was
aliquoted and stored at -70°C. Protein concentration was determined
using BSA as a standard except at the last step, in which concentration
was determined by absorbance using ε280 = 60,136 M⁻¹ cm⁻¹.

Step                     Total     Total       Specific     Fold          %
                         protein   units(1)    activity     purification  yield
                         (mg)                  (units/mg)

I   Lysate(2,3)          4414      3.0x10^10    7x10^6      1.0           100
II  Ammonium Sulfate     1584      2.5x10^10   16x10^6      2.3            83
III Heparin               990      2.6x10^10   26x10^6      3.7            87
IV  Q Sepharose           781      2.6x10^10   33x10^6      4.7            87
V   EAH-Sepharose(4)      732      2.5x10^10   34x10^6      4.9            83

(1) One unit is defined as pmol nucleotide incorporated in 20 seconds.

(2) Lysate of BL21(DE3)pLysS cells harboring the pET3c vector yielded a
specific activity of 1252 units/mg.

(3) Omission of γ and δ from the assay of the lysate resulted in a
7650-fold reduction of specific activity (915 units/mg).

(4) Using pure δ', omission of γ from the assay gave no detectable
synthesis under the conditions of the assay.
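
The derived columns of the purification table follow from the total
protein and total units at each step. The Python sketch below
illustrates the arithmetic using the values in the table; the function
and variable names are hypothetical. Because the table appears to have
been computed from rounded specific activities, the fold purification
obtained here from unrounded values can differ slightly from the
tabulated figures.

```python
# (step, total protein in mg, total units) as listed in the table above.
STEPS = [
    ("I Lysate", 4414, 3.0e10),
    ("II Ammonium Sulfate", 1584, 2.5e10),
    ("III Heparin", 990, 2.6e10),
    ("IV Q Sepharose", 781, 2.6e10),
    ("V EAH-Sepharose", 732, 2.5e10),
]


def purification_table(steps):
    """Return (step, specific activity, fold purification, percent yield)."""
    _, first_mg, first_units = steps[0]
    first_specific_activity = first_units / first_mg
    rows = []
    for name, protein_mg, units in steps:
        specific_activity = units / protein_mg          # units per mg
        fold = specific_activity / first_specific_activity
        percent_yield = 100.0 * units / first_units
        rows.append((name, specific_activity, fold, percent_yield))
    return rows


if __name__ == "__main__":
    for name, sa, fold, pct in purification_table(STEPS):
        print(f"{name:22s} {sa:10.3g} units/mg  {fold:4.1f}-fold  {pct:4.0f}%")
```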

The purified overproduced δ' stimulated γδ 30-fold in its action in
reconstituting the processive holoenzyme from the αε polymerase and
the β clamp accessory protein. In this assay the δ' is titrated into a
reaction containing a low concentration of γ and δ, and also containing
the β subunit, αε polymerase and M13mp18 ssDNA primed with a synthetic
oligonucleotide and coated with SSB. The proteins were preincubated
with the DNA for 8 minutes to allow time for the accessory proteins to
form the preinitiation complex which contains the β clamp and for αε to
bind the preinitiation complex. DNA synthesis is initiated upon addition
of deoxyribonucleoside triphosphates and the reaction is stopped after
20 seconds, which is sufficient time for the processive reconstituted
polymerase to complete the circular DNA. Although a processive
polymerase can be reconstituted without the δ' subunit, under the
conditions used in the present invention in which γ and δ are at low
concentration, the δ' subunit stimulates the reaction greatly (30-fold).
The δ' subunit saturated this assay at a level of approximately one
molecule of δ' to one molecule of δ.
Both the τ and γ subunits of the holoenzyme are encoded by the
same gene (dnaX). The γ subunit is formed as a result of a -1 frameshift
during translation, with the result that γ is only 2/3 the length of τ
due to an earlier translational stop codon (within 2 codons) in the -1
reading frame. The activity of the γ and τ proteins in reconstituting
the processive polymerase was compared using either the δ, δ' or both
δδ' subunits in the presence of αε complex and β subunit (Fig. 6A and
6B). In the absence of δ and δ', the γ subunit alone displays
insignificant activity in the reconstitution assay, although when a
large amount of γ was present it had very little, but detectable,
activity (Fig. 6A). The δ subunit provides γ with activity in the
reconstitution assay, but δ' does not provide γ with activity. However,
the cloned δ' subunit, when present with δ, markedly stimulated the
activity of the γ and δ mixture such that maximal activity was achieved
at much lower concentrations of added γ.

The τ subunit alone, like γ, was also essentially inactive in the
reconstitution assay, although at very high amounts of τ a slight, but
reproducible, amount of activity was observed. τ is active with δ in
this assay although more τ (50-fold) than γ is needed for comparable
activity. Previously it was observed that τ was unlike γ in that τ was
active with δ' in the reconstitution assay in the absence of any δ
subunit (only τ, δ' and α, ε, β were needed). Consistent with these
previous results, the δ' subunit is active with τ in the absence of δ
(similar to the activity of τ and δ in the absence of δ'). With both δ
and δ' present, only a small amount of τ subunit is required for maximal
activity in the reconstitution assay. The activity of τδδ' parallels
that of γδδ' and requires 500-fold less τ for maximal activity than
either τδ or τδ'. Hence, both the γ subunit and the τ subunit are highly
active in this reconstitution assay when both δ and δ' are present.



The effect of the δ, δ' and β subunits on the DNA dependent ATPase
activity of τ was quite different from their effect on γ, the close
relative of the τ subunit. The τ subunit, by itself, is a much more
active DNA dependent ATPase than γ and, in fact, turns over two times
more ATP than the γδδ' complex. Unlike the γ ATPase, the τ ATPase was
essentially unaffected by β, or by δ with or without β, or by δ' with
or without β. However, like the γ ATPase, the presence of both δ and δ'
stimulated the τ ATPase, although the effect was only 4-fold compared
to the 30-fold stimulation of γ by δδ'. Whereas β stimulated the γδδ'
ATPase 3-fold, the β subunit did not stimulate the τδδ' ATPase at all;
in fact β slightly inhibited it, yet the τδδ' complex is as active as
γδδ' in reconstituting a processive polymerase with β and αε.

The cloned δ' preparation appears as a doublet in a 13% SDS
polyacrylamide gel and the two polypeptides are of the same size and
molar ratio (2:1, lower band-to-upper band) as the δ' doublet purified
from the γ complex. Electrospray mass spectrometry revealed that the
smaller polypeptide (δ'S) was the size predicted from the gene sequence
and the larger polypeptide (δ'L) was increased in size by 521 Da. The
nature of the larger polypeptide is presently under investigation.
Possibilities include mRNA splicing, use of an upstream translational
start signal, readthrough of the stop codon, translational
frameshifting, and posttranslational modification. Whatever the
mechanism which produces δ'L, it must be efficient since the highly
overproduced δ' still contains the same level of δ'L relative to δ'S as
is found within the holoenzyme. Irrespective of how δ'L is synthesized,
the fact remains that δ'L and δ'S are different. Presumably they also
have functional differences, as in the case of the related γ and τ
subunits. Whereas τ and γ both appear to be within each holoenzyme
molecule, it remains to be shown whether the δ'L and δ'S subunits are
on one or on different holoenzyme molecules.

Sequence analysis of δ'L and δ'S shows they have identical
N-termini, proving δ'L is not derived from an alternate upstream ATG
start site. Translational readthrough of the stop codon was considered
as an explanation which would produce a protein containing 19
additional amino acids before the next stop codon in the open reading
frame, but this would increase the size of δ' by 2130 Da, much larger
than the observed mass of δ'L. Treatment of δ' with calf intestinal and
bacterial alkaline phosphatases did not affect the mobility of either
δ'S or δ'L, suggesting that serine and threonine phosphorylation is not
involved in the formation of δ'L; attachment of other groups remains a
possibility. Hence, translational frameshifting (or jumping), covalent
modification (other than phosphate on Ser or Thr) and mRNA splicing
remain possible.

It seems most pertinent to consider translational frameshifting
as a source of δ'L since such a mechanism has precedent in holoenzyme
structure. The dnaX gene encoding the τ subunit of the holoenzyme
generates the γ subunit by a translational frameshift into the -1
reading frame. If δ'L is produced by a -1 frameshift, the frameshift
would have to occur upstream of the holB stop codon but not so far
upstream that a -1 frameshift would produce a truncated protein due to
running into an early -1 frame stop codon. Thus the -1 frameshift
would have to occur at or after the last -1 frame stop codon near
Glu320, after which translation would proceed past the normal stop
codon in the open reading frame to produce a protein which is 7 amino
acids larger than that predicted by the open reading frame of holB.

The γ complex expends ATP energy to clamp the β subunit onto a
primer and it is this β dimer clamp that tethers the αε polymerase to
the template for rapid and highly processive DNA synthesis. The αε
polymerase is only efficient after the β subunit has been clamped
onto the DNA by γ complex action. A mixture of the γ and δ subunits is
sufficient in this assay to clamp β onto DNA; however, much more γ and
δ is needed relative to the amount of γ complex. The δ' subunit
stimulates γ and δ in this assay such that the amounts of γ, δ and δ'
are nearly comparable with the amount of γ complex that is required
(the χ and ψ subunits give another 3-8 fold stimulation of activity at
low concentrations of γδδ', as described in the accompanying report).
Likewise, neither δ nor δ' has a large effect on the ATPase activity of
γ, but addition of both δ and δ' to γ gives a 30-fold stimulation of
the γ ATPase activity. The requirement of both δ and δ' for efficient
replication activity and for maximal ATPase activity of γ correlates
with the physical studies in the accompanying report, which show that δ
and δ' form a complex and the δδ' complex binds tightly to γ, whereas
when δ and δ' are added separately with γ they do not form a strong γδ
or γδ' complex.

The τ subunit contains the sequence of the γ subunit (γ is
produced from τ) plus an extra domain of 212 amino acids which binds
to α and to DNA.

A homology search of the translated GenBank indicated that the
most homologous protein to δ' of the present invention was another E.
coli protein, the γ/τ subunit(s) of DNA polymerase III holoenzyme. There
is 27% identity and 44% similarity including conservative substitutions
over the entire length of δ' and γ/τ. One particular region in δ' of 50
amino acids (amino acids 110-159) is strikingly similar to γ/τ (amino
acids 121-170), having 49% identity. A putative helix-turn-helix motif
in γ/τ (Ala114-X3-Gly118-X5-Leu124) is positioned just 19 residues
downstream of the helix-turn-helix motif in δ'.

The extent of sequence homology between δ' and the γ/τ subunits
is above the level required to speculate that they have similar three
dimensional structures. When both δ'S and δ'L are taken into account,
four of the eleven subunits within the holoenzyme, according to the
present invention, may have similar structures.

The interactions between δ and δ' were also studied as part of the
present invention.

Equal amounts of δ and δ' were incubated together for 30 minutes
at 15°C and then analyzed by gel filtration and glycerol gradient
sedimentation. Gel filtration analysis showed that the δ and δ' subunits
comigrate and elute approximately six-to-eight fractions earlier than
either δ or δ' alone, indicating that they form a δδ' complex.
Comparison with protein standards yields a Stokes radius of 31.1 Å. The
δ and δ' also comigrated during glycerol gradient analysis and
sedimented faster than either δ or δ' alone, again consistent with
formation of a δδ' complex, with an S value of 3.9S. Combining the S
value and Stokes radius yields a native mass of 53 kDa for the δδ'
complex, more consistent with the mass of a 1:1 complex of δ1δ'1 (75.6
kDa) than of a higher order aggregate of δδ'. Both δ'L and δ'S are
visible in the δδ' complex, indicating they are present as a mixture of
δδ'L and δδ'S. Formation of a trimeric δδ'Lδ'S complex is unlikely as
the combined mass would be 113 kDa, twice the observed mass. However,
if free δ and δ' were in a rapid equilibrium with the δδ' complex then
the observed mass of the δδ' complex would be a weighted average of the
amount of complex and amount of free subunits, and therefore the
possibility of a higher order aggregate such as a δδ'Lδ'S complex
cannot be rigorously excluded.

Densitometry analysis of the Coomassie Blue stained gel yielded
a molar ratio of δ:δ' of 1.1:1.0, respectively (the two δ' bands were
considered together as one δ'), further supporting the δ1δ'1
composition. Different proteins may take up different amounts of
Coomassie Blue stain and therefore molar ratios determined by
densitometry must be regarded as tentative. A dynamic light scattering
analysis of δ, δ' and δδ' complex is also presented in the table below.

The Stokes radius and sedimentation coefficient of δ, δ' and δδ'
complex were determined from the gel filtration and glycerol gradient
sedimentation analyses; and the native molecular mass and the
frictional coefficient were calculated from the Stokes radius and S
value. These calculations require the partial specific volume of δ and
δ'; these volumes were calculated by summation of the partial specific
volumes of the individual amino acids for each of δ and δ'. Molecular
weights of δ, δ' and the δδ' complex (assuming a composition of δ1δ'1)
were calculated from the gene sequences of δ and δ'.

                                            δ         δ'        δδ'

Stokes radius (Å)                           26.5      25.8      31.1
Sedimentation coefficient (S)               3.0       3.0       3.9
Partial specific volume                     0.74      0.74      0.74
Native mass (radius and S value)            34,708    33,791    52,952
Native mass (gene sequence)                 38,704    36,934    75,630
Frictional coefficient                      1.22      1.20      1.25
Diffusion coefficient (light scattering)    7.60      8.16      6.61
Radius calculated (D)                       28.2      26.3      32.5
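
The native masses and frictional coefficients in the table can be
reproduced from the Stokes radius, S value and partial specific volume.
The Python sketch below is an illustration under stated assumptions:
the text does not give the equation used, so the standard relation
M = 6πηN·a·s / (1 − v̄ρ) is assumed, together with a solvent viscosity of
0.01 poise and a solvent density of 1.0 g/ml. These assumed constants
were chosen because they reproduce the tabulated masses closely. The
function names are hypothetical.

```python
import math

AVOGADRO = 6.022e23        # per mol
VISCOSITY_POISE = 0.01     # g/(cm s); assumed solvent viscosity
DENSITY_G_PER_ML = 1.0     # assumed solvent density


def native_mass(stokes_radius_angstrom, s_value_svedberg, partial_specific_volume):
    """Native mass (Da) from Stokes radius and sedimentation coefficient."""
    radius_cm = stokes_radius_angstrom * 1e-8
    s_seconds = s_value_svedberg * 1e-13
    numerator = 6 * math.pi * VISCOSITY_POISE * AVOGADRO * radius_cm * s_seconds
    return numerator / (1 - partial_specific_volume * DENSITY_G_PER_ML)


def frictional_ratio(stokes_radius_angstrom, mass_da, partial_specific_volume):
    """Stokes radius divided by the radius of an anhydrous sphere of equal mass."""
    volume_cm3 = partial_specific_volume * mass_da / AVOGADRO
    sphere_radius_cm = (3 * volume_cm3 / (4 * math.pi)) ** (1 / 3)
    return stokes_radius_angstrom * 1e-8 / sphere_radius_cm


if __name__ == "__main__":
    for label, radius, s_value in [
        ("delta", 26.5, 3.0),
        ("delta prime", 25.8, 3.0),
        ("delta-delta prime complex", 31.1, 3.9),
    ]:
        mass = native_mass(radius, s_value, 0.74)
        ratio = frictional_ratio(radius, mass, 0.74)
        print(f"{label}: {mass:,.0f} Da, f/f0 = {ratio:.2f}")
```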



The diffusion coefficient obtained from the light scattering
analysis can be used to calculate the Stokes radius and these values
were within 6% of the Stokes radius of δ, δ' and δδ' complex determined
in gel filtration.
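
The conversion from diffusion coefficient to Stokes radius can be
sketched with the Stokes-Einstein relation, R = kT / (6πηD). The Python
example below is illustrative only. It assumes that the tabulated
diffusion coefficients are in units of 10⁻⁷ cm²/s, that the measurement
temperature was 20°C and that the solvent viscosity was 0.01 poise;
none of these is stated in the text, but together they reproduce the
tabulated radii to within about 0.1 Å. The function name is
hypothetical.

```python
import math

BOLTZMANN_ERG_PER_K = 1.380649e-16


def stokes_radius_angstrom(diffusion_e7, temperature_k=293.15, viscosity_poise=0.01):
    """Stokes radius (angstrom) from a diffusion coefficient given in 1e-7 cm^2/s."""
    diffusion_cm2_per_s = diffusion_e7 * 1e-7
    radius_cm = (BOLTZMANN_ERG_PER_K * temperature_k
                 / (6 * math.pi * viscosity_poise * diffusion_cm2_per_s))
    return radius_cm * 1e8


if __name__ == "__main__":
    for label, diffusion in [
        ("delta", 7.60),
        ("delta prime", 8.16),
        ("delta-delta prime complex", 6.61),
    ]:
        print(f"{label}: {stokes_radius_angstrom(diffusion):.1f} angstrom")
```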

In the γ complex, the γ, δ and δ' subunits are bound together along
with the χ and ψ subunits. The activity analysis described herein
indicates that γ and δ interact since both are necessary and sufficient
to assemble the β clamp onto DNA. Further, the δ' subunit stimulates
the DNA dependent ATPase activity of γ, indicating that γ and δ'
interact.

The physical interactions between δ, δ' and γ were examined using
the gel filtration technique, which detects tightly bound
protein-protein complexes, but since components are not at equilibrium
during gel filtration, weak protein complexes will dissociate. The γ
subunit (47 kDa) is larger than δ and δ', and is at least a dimer in
its native state with a large Stokes radius and quite an asymmetric
shape (γ runs as a trimer or tetramer in gel filtration and as a dimer
in a glycerol gradient). The γ was mixed with a 4-fold molar excess of
δ and δ' then gel filtered. A complex of γδδ' was formed as indicated
by the comigration of both the δ and δ' subunits with γ. The excess δδ'
complex eluted much later (fractions 40-46). Since δ binds δ', it is
possible that only one, for example δ, binds γ and the other (e.g. δ')
is part of the complex by virtue of binding δ instead of directly
interacting with γ. To determine which subunit, δ or δ', binds directly
to γ, the γ subunit was mixed with either δ or δ' then gel filtered.
The mixture of γ and δ showed that γ and δ did not form a gel
filterable γδ complex, as indicated by the absence of δ in fractions
24-32 containing γ. The mixture of γ and δ' showed that δ' did not form
a complex with γ either, as indicated by the absence of δ' in fractions
containing γ. Therefore both δ and δ' must be present to form a gel
filterable complex with γ. Using pure cloned δ, no γδ complex was seen
in gel filtration (or in glycerol gradient analysis).

The gel filtration column fractions of the γδδ' complex were
analyzed for their activity in assembly of the β clamp on primed DNA.
Fractions containing the γδδ' complex were quite active. The δδ'
complex, even at high concentration, is not active in assembly of the β
clamp and therefore the slight amount of activity in the following
fractions was probably due to a slight amount of γ which trailed into
the peak of the δδ' complex, thus giving activity in the assay. The
column fractions of the γδ and γδ' mixtures were inactive except for
the peak fraction of γ in the γδ' analysis, which supported weak
activity; there was a slight, barely detectable amount of δ' (but not
δ) in the fractions containing γ, as though a slight amount of γδ'
complex was formed and survived the column.

Following these studies with δ and δ', the present invention has
found that δ behaved as a monomer in gel filtration and glycerol
gradient sedimentation. The δ' subunit also appeared monomeric.
Neither δ nor δ', when separate, formed a gel filterable complex with
the γ subunit. Yet they most likely bind to γ (at least weakly) as
indicated by activity assays in which γδ is active (without δ') in
assembly of the β clamp, and δ' (without δ) stimulates the DNA
dependent ATPase activity of γ. The δ and δ' subunits bound each other
to form a gel filterable 1:1 δ1δ'1 complex and when mixed with γ they
efficiently formed a tight gel filterable γδδ' complex. Hence, the
binding of δ and δ' to γ is cooperative.

The δ' subunit is a mixture of two related proteins, δ'L and δ'S,
which are encoded by the same gene; δ'L is 521 Da larger than the gene
sequence predicts. The functional and structural difference between
them is presently unknown. In these binding studies, both δ'S and δ'L
bound to δ and they both assembled into the γδδ' and τδδ' complexes,
consistent with the fact that both δ'L and δ'S are observed within
pol III and the γ complex.

No single subunit of the γ complex is active in assembling the β
clamp on DNA. Presumably this reaction is too complicated for just one
protein. A mixture of γ and δ is capable of assembling β onto DNA
although they are inefficient and require δ' for efficient activity.
Perhaps δ' increases the efficiency of γδ by physically bringing γ and δ
together in the γδδ' complex, although it is also possible that δ'
participates directly in the chemistry of the reaction. The γ subunit
has a low level of DNA dependent ATPase activity and, as described
above, δ binds the β subunit. These two facts allow speculation that γ
binds the primed template, and δ brings in the β subunit, then ATP
hydrolysis is coupled to assemble the ring shaped β dimer around the
DNA.

Since γ is known to bind ATP and has a low level of DNA
dependent ATPase activity, it is an obvious candidate as the subunit
which interacts with the ATP in the β clamp assembly reaction. Two
molecules of ATP are hydrolyzed in the initiation reaction in which the
holoenzyme becomes clamped onto a primed template to form the
initiation complex. This initiation reaction has its basis in the
assembly of the β clamp on DNA. The stoichiometry of two ATP
hydrolyzed in formation of one initiation complex suggests two
proteins hydrolyze ATP. These two proteins may be the two halves of a
γ dimer. However, it is also possible that δ interacts with ATP. The
sequence of δ shows a very close match to the consensus for an ATP
binding site and UV induced cross-linking studies suggest that δ binds
ATP. The availability of δ in quantity should now make possible a full
description of the mechanism by which ATP is coupled to assemble the
ring shaped β dimer around DNA.

The third subunit according to the present invention, that of θ,
was also identified, purified, cloned and sequenced. N-terminal
analysis of the θ peptide yielded the following sequence of 40 amino
acids:

Met Leu Lys Asn Leu Ala Lys Leu Asp Gln Thr Glu Met Asp Lys
                  5                  10                  15
Val Asn Val Asp Leu Ala Ala Ala Gly Val Ala Phe Lys Glu Arg
                 20                  25                  30
Tyr Asn Met Pro Val Ile Ala Glu Ala Val
                 35                  40

Based upon this sequence, two DNA probes were fashioned. These
probes had the sequences of:

ATG CTG AAA AAC CTG GCT AAA CTG GAT CAG ACT GAA ATG GAT AAA 45
GTT AAC GTT GAT 57; and

CTG GCT GCT GCT GGT GTT GCT TTT AAG GAA CGT TAT AAC ATG CCG 45
GTT ATT GCT GAA 57.

These two probes were also end-labelled with 32P for use with
Southern blot procedures.
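
As a consistency check on the two probes, the following minimal sketch
translates each 57-mer with the standard genetic code and compares the
result with residues 1-38 of the N-terminal sequence. The helper name
`translate` and the one-letter rendering of the peptide are assumptions
introduced for illustration only; they are not part of the described
procedure.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces ignored) into one-letter amino acids."""
    seq = dna.replace(" ", "").upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# Probe sequences as listed in the text.
PROBE_1 = "ATG CTG AAA AAC CTG GCT AAA CTG GAT CAG ACT GAA ATG GAT AAA GTT AAC GTT GAT"
PROBE_2 = "CTG GCT GCT GCT GGT GTT GCT TTT AAG GAA CGT TAT AAC ATG CCG GTT ATT GCT GAA"

# Residues 1-38 of the N-terminal sequence, rewritten in one-letter code
# (hypothetical constant name, used only for this check).
N_TERMINUS_1_38 = "MLKNLAKLDQTEMDKVNVDLAAAGVAFKERYNMPVIAE"

encoded = translate(PROBE_1) + translate(PROBE_2)
print("probes encode residues 1-38:", encoded == N_TERMINUS_1_38)
```

Both probes are 19 codons long, so together they cover residues 1-38 and
leave the last two residues of the 40-residue sequence unprobed.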

For Southern blot analysis, E. coli DNA was cut with the 8 Kohara
map enzymes [see Cell 50:495 (1987)]. The two probes described above
were used to probe two Southern blots of E. coli DNA. The bands (DNA
fragments) in common with the two blots were noted, as was their size.
At least 3 positions on the Kohara map of the E. coli chromosome were
consistent with the Southern blot fragmentation pattern.

Thus, based upon these findings, E. coli DNA digested with either
EcoRV or PvuII following DNA extraction [see J.M.B. 3:208 (1961)] was
run out in an agarose gel, and all the DNA in the size region of the gel
corresponding to the fragment size containing θ for that enzyme (PvuII
or EcoRV) from the Southern blot analysis, was extracted from the gel
and cloned into M13mp18 and M13mp19 using conventional techniques.
The M13 transformant DNAs were analyzed by Southern blot and probed
using the two probes described above. One M13 DNA was obtained with
the θ sequence. When this M13 θ was sequenced, however, not all the
theta gene was present; the gene extended beyond the PvuII restriction
site. The M13 θ was then used as a reagent to obtain the complete θ
gene.

A Kohara λ phage (λ336) was grown and the θ gene in E. coli was
excised using an EcoRV cut 2.7 kb fragment. Next, a filter containing
all the Kohara λ phage was probed using the partial θ gene as the probe.
Thus, it was possible to identify the λ phage containing the full θ gene.
The holE gene was then cloned from the λ phage into pUC18 and
subsequently sequenced. The full genetic sequence for the θ gene was
thus determined to be:

ATG CTG AAG AAT CTG GCT AAA CTG GAT CAA ACA GAA ATG  39
GAT AAA GTG AAT GTC GAT TTG GCG GCG GCC GGG GTG GCA  78
TTT AAA GAA CGC TAC AAT ATG CCG GTG ATC GCT GAA GCG 117
GTT GAA CGT GAA CAG CCT GAA CAT TTG CGC AGC TGG TTT 156
CGC GAG CGG CTT ATT GCC CAC CGT TTG GCT TCG GTC AAT 195
CTG TCA CGT TTA CCT TAC GAG CCC AAA CTT AAA 228
The open reading frame above predicts that θ is a 76 amino acid
protein of 8,629 Da. The underlined nucleotide sequence exactly
matches the corresponding N-terminal sequence of θ. In addition, the
upstream sequence contains two putative RNA polymerase promoter
signals and a Shine-Dalgarno sequence. This upstream sequence is:
AG GCGTAGCGAA GGGAGCGTGC AGTTGAAGCC ATATTATCTA TTCCITI'ITG 52 
TAATAACTTT THACA GAGG ATAACCTTCQLC^^ AGTCGAGGAT 102 

CATCAATTCC GGCTTGCCAT CCTGGCTCAC TCTTAGTAAC TTTTGCCCGC 152 
GAATGATQMi^GATERAGA 172 
The downstream sequence begins with a stop codon:

TAA AACTTATAC AGAGTTACAC TTTCTTACAT AACGCCTGCT AAATTATGAG 52
TATTTTCTAA ACCGCACTCA TAATTTGCAG TCATTTTGAA AAGGAAGTCA 102
TTATG 107

This translated into the peptide sequence:

Met Leu Lys Asn Leu Ala Lys Leu Asp Gln Thr Glu Met Asp Lys
                  5                  10                  15
Val Asn Val Asp Leu Ala Ala Ala Gly Val Ala Phe Lys Glu Arg
                 20                  25                  30
Tyr Asn Met Pro Val Ile Ala Glu Ala Val Glu Arg Glu Gln Pro
                 35                  40                  45
Glu His Leu Arg Ser Trp Phe Arg Glu Arg Leu Ile Ala His Arg
                 50                  55                  60
Leu Ala Ser Val Asn Leu Ser Arg Leu Pro Tyr Glu Pro Lys Leu
                 65                  70                  75
Lys
 76

Using site-directed mutagenesis, the initial Met codon (AGA ATG)
was mutated to CAT ATG (NdeI site) using an oligonucleotide with 15
bases on either side of the mutation. This was then used to obtain the
overproduction of the θ gene in which the mp19θ (a 2700 bp insert) was
grown in strain CJ236 cells in the presence of uridine. The
single stranded DNA from these cells was purified and hybridized with
the NdeI mutation and replicated with the holoenzyme in vitro. XL1-Blue
cells were transformed with the double stranded DNA product and ten
plaques were selected for miniprep sequencing; all 10 plaques
contained the mutation. The θ sequence was excised from the DNA with
HindIII, NdeI, and the resulting 1 kbp fragment was inserted into pET-3c
[see Methods in Enzymology 185:60 (1990)]. The resulting pET-3cθ was
used to transform competent cells [BL21(DE3)]. Single colonies of the
transformed cells were grown in liquid media at 37°C to an OD of about
0.6, induced with IPTG generally as described previously, and harvested
post induction. Successful overexpression of the θ peptide was
obtained using this system.

The N-terminal sequence analysis of θ was examined as follows:
PolIII was purified [see J. Biol. Chem. 263:6570 (1988)] except that the
last step using Superose 6 was replaced with an ATP-agarose column
(Pharmacia, type II) which was eluted with a linear salt gradient. After
the δδ' eluted from the ATP agarose column, a mixture of pure polIII' and
γχψ complex eluted together. This mixture was separated by column
chromatography on MonoQ using a linear gradient of 0-0.4 M NaCl in
buffer A. The polIII' which was eluted after the γχψ complex was used
as the source of θ subunit. The subunit was separated from the α, τ and ε
subunits of polIII' by electrophoresis in a 15% SDS polyacrylamide gel,
and was electroblotted (110 pmol) onto PVDF membrane. The θ subunit
was visualized by Ponceau S stain, and the N-terminal sequence was
determined to be:

NH2-Met Leu Lys Asn Leu Ala Lys Leu Asp Gln Thr Glu Met Asp Lys
                      5                  10                  15
    Val Asn Val Asp Leu Ala Ala Ala Gly Val Ala Phe Lys Glu Arg
                     20                  25                  30
    Tyr Asn Met Pro Val Ile Ala Glu (Ala) (Val)
                     35

in which the parentheses indicate uncertain amino acid assignments.

The θ gene was isolated using E. coli genomic DNA isolated from strain
C600 [see J. Mol. Biol. 3:208 (1961)], cut with the Kohara panel of
restriction enzymes (BamHI, HindIII, EcoRI, EcoRV, BglI, KpnI, PstI and
PvuII), and separated in a 0.8% native agarose gel. The gel was
depurinated (0.25 M HCl), denatured (0.5 M NaOH, 1.5 M NaCl) and
neutralized (1 M Tris, 2 M NaCl, pH 5.0) prior to transfer of the DNA to
Genescreen Plus (DuPont New England Nuclear) using a Vacugene
(Pharmacia) apparatus in the presence of 2xSSC (0.3 M NaCl, 0.03 M
sodium citrate, pH 7.0). The membrane was air dried prior to
hybridization. Two synthetic oligonucleotide DNA 57-mer probes were
designed based on the N-terminal sequence of θ assuming the highest
frequency of codon usage and favoring T over C in the wobble position.
The two probes (5'->3') were:
Theta 1 (codons 1-19):
ATG CTG AAA AAC CTG GCT AAA CTG GAT CAG ACT GAA ATG GAT 42
AAA GTT AAC GTT GAT 57; and
Theta 2 (codons 20-38):
CTG GCT GCT GCT GGT GTT GCT TTT AAG GAA CGT TAT AAC ATG 42
CCG GTT ATT GCT GAA 57.
The DNA 57-mers (100 pmol each) were 5' end-labelled using 1
[γ-32P]ATP and T4 polynucleotide kinase, and then used to probe
Southern blots of the restricted E. coli genomic DNA. Two Southern blots
were hybridized individually using one or the other of the 57-mer
probes overnight in the same buffer as above except with an additional
200 μg/ml of denatured salmon sperm DNA. The Southerns were washed
in 2XSDS at room temperature for 30 minutes, then 3 hours at 42°C
(changing the buffer each hour), then exposed to X-ray film. The Theta 1
probe showed a single band in 7 of the 8 restriction digests; the Theta 2
probe consistently showed many bands in each lane which were
eliminated equally as the hybridization and washing conditions were
gradually increased in stringency, suggesting that Theta 2 did not
match the true sequence of the holE gene. After holE was cloned and
sequenced, it was found that 7 nucleotides of Theta 1 and 12
nucleotides of Theta 2 did not match the holE sequence.
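
The comparison between the Theta 1 probe and the cloned gene can be
sketched as a position-by-position mismatch count. The gene string below
is codons 1-19 of the holE listing given earlier; codon 14 is illegible
in that listing and is read here as GAT (Asp), which is an assumption.
The function name `mismatch_positions` is hypothetical. Theta 2 is not
compared because several codons in that part of the listing are not
legible enough to support a reliable count.

```python
THETA_1 = "ATG CTG AAA AAC CTG GCT AAA CTG GAT CAG ACT GAA ATG GAT AAA GTT AAC GTT GAT"

# Codons 1-19 of holE as listed in the gene sequence (codon 14 assumed GAT).
HOLE_CODONS_1_19 = "ATG CTG AAG AAT CTG GCT AAA CTG GAT CAA ACA GAA ATG GAT AAA GTG AAT GTC GAT"


def mismatch_positions(probe: str, target: str) -> list[int]:
    """Return 1-based nucleotide positions at which probe and target differ."""
    p = probe.replace(" ", "")
    t = target.replace(" ", "")
    if len(p) != len(t):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(p, t)) if a != b]


positions = mismatch_positions(THETA_1, HOLE_CODONS_1_19)
print(len(positions), "mismatches at positions", positions)
```

Under these assumptions every mismatch falls on the third (wobble) base
of a codon, as expected for a probe designed from the amino acid
sequence by codon-usage guessing.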
To clone the holE gene, 100 μg of E. coli DNA was digested with
PvuII, and the small population of DNA fragments migrating in the 400
to 600 bp range (the Southern blot using Theta 1 probe indicated holE
was on a 500 bp PvuII fragment) was extracted from the agarose gel,
blunt-end ligated into M13mp18 digested with HincII, and transformed
into competent XL1-Blue cells. Presence of the holE gene was
determined by Southern blot analysis of minilysate DNA prepared from
recombinant colonies using the 5' end-labelled Theta 1 as a probe. One
positive clone was obtained and sequenced; it contained approximately
one-half of the holE gene (a PvuII site lies in the middle of holE). This
fragment of holE was uniformly labelled using the random primer
labelling method, and used to screen the complete Kohara ordered
lambda phage library of E. coli chromosomal DNA transferred onto a
nylon membrane. Prehybridization and hybridization were conducted as
described above except that the temperature was increased to 65°C and
the wash steps were more stringent (2xSSC, 0.2% SDS, next 1xSSC, and
then 0.5xSSC at 65°C). A single phage clone λ19H3 (336 of the
miniset) [see Cell 50:495 (1987)] hybridized with both the genomic
fragment and the Theta 1 probe.

The phage was grown and a 2.7 kb EcoRV fragment containing the θ gene
was excised, purified from a native agarose gel, and blunt-end ligated
into the HincII site of M13mp19 to yield M13mp19-θ. The 2.7 kb EcoRI-
HindIII fragment from M13mp19-θ was excised, gel purified, and
directionally ligated into the corresponding sites of pUC18 to generate
pUC-θ. Both strands of the holE gene in pUC-θ were sequenced using the
Sequenase kit, [α-35S]dATP, and synthetic DNA 20-mers. This time the
entire holE gene was present.

An NdeI site was generated at the start codon of the holE gene by
oligonucleotide site directed mutagenesis using a DNA 33-mer:

ATGATGAGGA GATTACATAT GCTGAAGAAT CTG 33

containing an NdeI site (CATATG) at the start codon of holE to prime
replication of M13mp19-θ viral ssDNA and using SSB and DNA
polymerase III holoenzyme in place of DNA polymerase I. The NdeI site
in the resultant phage (M13mp19-θ-NdeI) was verified by DNA
sequencing. An approximately 1 kb NdeI-HindIII fragment was excised
from M13mp19-θ-NdeI and directionally ligated into the corresponding
sites of pUC18 to yield pUC-θ-NdeI. A 1 kb NdeI/BamHI fragment from
pUC-θ-NdeI was then subcloned directionally into pET3c digested with
both NdeI and BamHI to generate the overproducing plasmid, pET-θ.

Reconstitution assays contained 72 ng phage φX174 ssDNA (0.04
pmol as circles) uniquely primed with a DNA 30-mer, 0.98 μg SSB (13.6
pmol as tetramer), 10 ng β (0.13 pmol as dimer), and 4 ng γ complex
(0.02 pmol) in 20 mM Tris-HCl (pH 7.5), 8 mM MgCl2, 5 mM DTT, 4%
glycerol, 40 μg/ml BSA, 0.5 mM ATP, 60 μM dGTP, and 0.1 mM EDTA in a
final volume of 25 μl (after addition of αε or αεθ). The αε and αεθ
complexes were each preformed upon mixing 38 pmol each of α and ε,
and when present, 152 pmol of θ in 12.5 μl of 25 mM Tris-HCl (pH 7.5),
2 mM DTT, 1 mM EDTA, 10% glycerol followed by incubation for 1 hour
at 15°C. These protein complexes were diluted 30-fold in the same
buffer just prior to addition to the assay on ice, then the assay tube
was shifted to 37°C for 6 minutes to allow reconstitution of the
processive polymerase on the primed ssDNA. DNA synthesis was
initiated upon rapid addition of 60 nM dATP and 20 μM [α-32P]TTP, then
quenched after 15 seconds and quantitated using DE81 paper. Proteins
used in the reconstitution assays were purified, and their
concentrations determined using BSA as a standard.

The following synthetic DNA 56-mer was designed as a hooked
primer template to assay 3'->5' exonuclease activity:

15 T ' , 

T TCGGCTrAAGGAG-3' 

T T(kxjGAATIO^TC(^ ' 

This DNA 56-mer (75 pmol as 56-mer) was 3' end-labelled with
75 pmol of [α-32P]dTTP (3000 Ci/mmol) using 200 units of terminal
transferase under conditions specified by the manufacturer (Boehringer)
in a total volume of 100 μl followed by spin dialysis to remove
remaining free nucleotide.

Prior to adding proteins to the assay, θ was titrated into ε upon
incubating 2 μg ε (70 pmol as monomer) with θ (0-10 μg, 0-1.16 nmol
as monomer) in a total volume of 10 μl buffer A containing 50 μg/ml
BSA at 15°C for 1 hour. The εθ mixture was then diluted 100-fold using
buffer A containing 50 μg/ml BSA. A 2.5 μl sample of diluted complex
was added to 200 fmol 3'-32P-end-labelled mispaired hook DNA in 12.5
μl of 25 mM Tris-HCl (pH 7.5), 4% sucrose, 5 mM MgCl2, 8 mM DTT, and
50 μg/ml BSA followed by a 3 minute incubation at 15°C. The reaction
was quenched upon spotting 13 μl of the mixture onto a DE81 filter.
The amount of mispaired nucleotide remaining was quantitated and
subtracted from the total mispaired template added to obtain the
amount of 3' mispaired nucleotide released.
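
The molar amounts quoted for the titration follow from the subunit
masses given elsewhere in this description (28.5 kDa for ε and 8,647 Da
for θ). The sketch below reproduces that arithmetic and estimates how
much ε reaches each assay after the 100-fold dilution. The function and
variable names are hypothetical, and treating the 100-fold dilution as
turning 10 μl into 1000 μl is an assumption.

```python
EPSILON_DA = 28_500.0  # epsilon monomer mass quoted in the text
THETA_DA = 8_647.0     # theta monomer mass quoted in the text


def micrograms_to_pmol(mass_ug: float, molar_mass_da: float) -> float:
    """Convert a protein mass in micrograms to picomoles."""
    return mass_ug * 1e-6 / molar_mass_da * 1e12


epsilon_pmol = micrograms_to_pmol(2.0, EPSILON_DA)      # about 70 pmol
theta_max_pmol = micrograms_to_pmol(10.0, THETA_DA)     # about 1.16 nmol

# 10 ul incubation diluted 100-fold (assumed 1000 ul total), 2.5 ul assayed.
fraction_assayed = 2.5 / (10.0 * 100.0)
epsilon_fmol_per_assay = epsilon_pmol * 1000.0 * fraction_assayed

print(f"epsilon: {epsilon_pmol:.1f} pmol")
print(f"theta (maximum): {theta_max_pmol / 1000.0:.2f} nmol")
print(f"epsilon per assay: {epsilon_fmol_per_assay:.0f} fmol")
```

On these numbers each assay receives roughly as much ε as there is
mispaired hook DNA (200 fmol), and the highest θ point corresponds to a
molar excess of θ over ε of about 16-fold.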
Gel filtration was performed using HR 10/30 fast protein liquid
chromatography columns, Superdex 75 and Superose 12, in buffer C.
Samples containing either θ, ε or α alone, and mixtures of these
subunits were incubated at 15°C for 1 hour. The entire sample was then
injected onto the column and after collecting the first 5.6 ml (Superdex
75) or 6.0 ml (Superose 12), fractions of 160 μl were collected and
analyzed in 15% SDS-polyacrylamide gels. Protein standards were a
mixture of proteins of known Stokes radius and were also analyzed.
Densitometry of stained gels was performed using a laser
densitometer, Ultrascan XL (Pharmacia-LKB).

Subunits (α, θ, ε) alone and mixtures of these subunits were
incubated 1 hour at 15°C (with 5% glycerol), then mixed with protein
standards of known S value (50 μg of each protein standard) and
immediately layered onto 12.3 ml linear 10%-30% glycerol gradients in
25 mM Tris-HCl (pH 7.5), 0.1 M NaCl, 1 mM EDTA. The gradients were
centrifuged at 270,000 x g for 44 hours (ε, θ, and εθ complex) or 26
hours (αε and αεθ) at 4°C. Fractions of 150 μl were collected from the
bottom of the tube and analyzed in a 15% SDS-polyacrylamide gel
stained with Coomassie Blue.

In summary, the sequence of the N-terminal 40 amino acids of θ
was obtained from the θ subunit within the polIII' subassembly (αεθτ)
of holoenzyme. This sequence did not match any previously identified in
GenBank, and therefore the invention attempted to identify the holE
gene using the Kohara restriction map of the E. coli chromosome. Two
57-mer DNA probes were made based on the N-terminal amino acid
sequence of θ and were used in a Southern analysis of E. coli genomic
DNA digested with the eight Kohara restriction map enzymes. One of
the 57-mer probes hybridized to a single band in 7 of the 8 digests
upon Southern analysis, indicating that these 7 fragments must overlap
in the holE gene. The Kohara restriction map was searched, and four
near matches were located. Since it could not be determined from the
Kohara map which of these positions was the true holE gene, the small
500 bp PvuII fragment from genomic DNA was directly cloned into M13mp18.
The DNA sequence of this PvuII fragment predicted an amino acid
sequence which matched exactly to the 40 residue N-terminal sequence
of θ. However, this was only a partial clone of holE due to an internal
PvuII site. The PvuII fragment and one of the synthetic 57-mers were
subsequently used to probe the entire Kohara library of overlapping λ
phage on one membrane, which identified the location of holE within λ
19H3 (No. 336 of the miniset).

The Kohara restriction map of the chromosome in the vicinity of
λ19H3 shows a close match to the fragment sizes obtained from the
Southern analysis. The overlapping fragments identify the position of
holE at 40.4 minutes on the E. coli chromosome. DNA analysis showed
two BglI sites separated by 122 bp that span the Theta 1 57-mer probe,
thus explaining the absence of a BglI fragment in the Southern analysis
in which a small fragment would have run off the end of the gel. This
small fragment would also have been missed in the procedure used by
Kohara, accounting for the single BglI site shown on the map.

A 2.7 kb EcoRV fragment was subcloned from λ19H3 into
M13mp18 and the holE gene was sequenced. The DNA predicts θ is a 76
amino acid protein of 8,647 Da, slightly smaller than the 10 kDa
estimated from the mobility of θ in an SDS-polyacrylamide gel. The pI of
θ based on the amino acid composition is 9.79, suggesting it is basic,
consistent with its ability to bind to phosphocellulose, but not to Q
Sepharose. The molar extinction coefficient of θ at 280 nm calculated
from its single Trp and the two Tyr residues is 8,250 M-1 cm-1.
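
The extinction coefficient can be reproduced by counting Trp and Tyr
residues in the predicted 76-residue sequence. The per-residue values
used below (5,690 M-1 cm-1 for Trp and 1,280 M-1 cm-1 for Tyr) are an
assumption: they are a commonly used pair of values and are chosen here
because they reproduce the figure stated in the text; the description
itself does not say which values were used. The constant and function
names are hypothetical.

```python
# Predicted theta sequence (76 residues) in one-letter code.
THETA_SEQ = (
    "MLKNLAKLDQTEMDKVNVDLAAAGVAFKERYNMPVIAEAV"
    "EREQPEHLRSWFRERLIAHRLASVNLSRLPYEPKLK"
)

TRP_E280 = 5690  # assumed molar absorbance per Trp at 280 nm, M^-1 cm^-1
TYR_E280 = 1280  # assumed molar absorbance per Tyr at 280 nm, M^-1 cm^-1


def extinction_280(sequence: str) -> int:
    """Estimate the 280 nm extinction coefficient from Trp and Tyr counts."""
    return sequence.count("W") * TRP_E280 + sequence.count("Y") * TYR_E280


def molar_concentration(a280: float, sequence: str, path_cm: float = 1.0) -> float:
    """Beer-Lambert: concentration (M) from absorbance at 280 nm."""
    return a280 / (extinction_280(sequence) * path_cm)


print(THETA_SEQ.count("W"), "Trp,", THETA_SEQ.count("Y"), "Tyr")
print("e280 =", extinction_280(THETA_SEQ), "M^-1 cm^-1")
```

With this coefficient, `molar_concentration(0.825, THETA_SEQ)` would
correspond to a 100 μM solution in a 1 cm cuvette.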

Site directed mutagenesis was performed on the holE gene cloned
into M13mp18 to create an NdeI site at the initiator methionine. The
holE gene was excised from the site mutated M13mp18, inserted into
pUC18 (in order to use a convenient BamHI site), then a 1 kb NdeI-BamHI
fragment containing holE was ligated directionally into the NdeI and
BamHI sites of pET3c to yield the pET-θ overproducing plasmid in which
holE expression is driven by T7 RNA polymerase. The pET-θ was
introduced into BL21(DE3) cells and upon induction of T7 RNA
polymerase by IPTG, θ was expressed to 63% of total cell protein. The
induced subunit was freely soluble upon cell lysis and its purification
was relatively straightforward. Four liters of cells were lysed and
300 mg of pure θ was obtained in 78% overall yield after column
chromatography on Q Sepharose, heparin agarose, and phosphocellulose.

Specifically, the purification of θ was carried out as follows. Four
liters of BL21(DE3) cells harboring the pET-θ expression plasmid
were grown in 4 L of LB media containing 50 μg/ml carbenicillin. Upon
growth to an OD600 of 0.6, IPTG was added to 0.4 mM and the cells were
incubated at 37°C for 2 hours further before they were harvested by
centrifugation (8.4 g wet weight) at 4°C, resuspended in 15 ml of cold
50 mM Tris-HCl (pH 7.5) and 10% sucrose, and stored at -70°C. The
cells were thawed and lysed by heat lysis. The cell lysate (Fraction I,
20 ml, 880 mg) was dialyzed (all procedures were performed at 4°C) for
2 hours against 2 L of buffer A, and then diluted 2-fold with buffer A to
a conductivity equal to 50 mM NaCl. The lysate was then applied to a 55
ml Q Sepharose fast flow column equilibrated in buffer A. The θ flowed
through the column as analyzed by a Coomassie Blue stained 15% SDS
polyacrylamide gel and confirmed by the stimulation of the ε
exonuclease activity assay developed for θ. The Q Sepharose flow
through fraction (Fraction II, 81 ml, 543 mg) was then applied to a 50
ml column of heparin agarose (BioRad) which was equilibrated in buffer
A containing 50 mM NaCl. The flow through fraction containing θ was
approximately 95% pure θ (Fraction III, 110 ml, 464 mg), and was dialyzed
overnight against 2 L buffer B, then applied to a 40 ml phosphocellulose
column (P11, Whatman) equilibrated in buffer B. The column was
washed with buffer B and θ was eluted using a 400 ml linear gradient of
10 mM to 200 mM sodium phosphate (pH 6.5) in buffer B. Eighty
fractions were collected and analyzed for θ. Fractions 42-56 were
pooled (Fraction IV, 68 ml, 300 mg) and dialyzed against 2 L buffer A
prior to aliquoting and storage at -70°C. The protein concentration was
determined using BSA as a standard. Concentration of pure θ
determined by absorbance at 280 nm using ε280 of 8,250 M-1 cm-1 was
90% of the protein concentration.
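
The protein bookkeeping across the four fractions can be tabulated as
below. This is only a sketch of total protein per step; it does not
reproduce the 78% overall yield quoted above, which presumably refers to
recovered activity rather than total protein. The volume of Fraction III
is taken as 110 ml (the unit is missing in the text, so this is an
assumption), and the names `FRACTIONS` and `summarize` are hypothetical.

```python
# (fraction label, volume in ml, total protein in mg) as given in the text.
FRACTIONS = [
    ("I   cell lysate", 20.0, 880.0),
    ("II  Q Sepharose flow-through", 81.0, 543.0),
    ("III heparin agarose flow-through", 110.0, 464.0),  # volume unit assumed ml
    ("IV  phosphocellulose pool", 68.0, 300.0),
]


def summarize(fractions):
    """Return (label, mg/ml, percent of Fraction I protein) for each step."""
    start_mg = fractions[0][2]
    return [
        (label, mg / ml, 100.0 * mg / start_mg)
        for label, ml, mg in fractions
    ]


for label, conc, percent in summarize(FRACTIONS):
    print(f"{label:34s} {conc:6.1f} mg/ml {percent:6.1f}% of starting protein")
```

Read this way, about a third of the starting protein mass remains in the
final pool, which is in line with θ having been expressed to a large
fraction of total cell protein.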
One unit is defined as the increase in fmol nucleotide released per
minute relative to the same reaction with no θ added (ε alone).

Throughout this description of the present invention, buffer A
was 20 mM Tris-HCl (pH 7.5), 10% glycerol, 0.5 mM EDTA, and 2 mM DTT;
buffer B was 10 mM NaPO4 (pH 6.5), 10% glycerol, 0.5 mM EDTA, and 2
mM DTT; and buffer C was 25 mM Tris-HCl (pH 7.5), 10% glycerol, 1 mM
EDTA and 100 mM NaCl.

Studies of the purified cloned θ showed it had the same amino
terminal sequence as predicted by holE (and θ within polIII' used for
electroblotting), proving that it was indeed the purified protein
encoded by the cloned gene. The activity of θ (stimulation of ε) co-
purified with θ throughout the preparation.

In searching for activity, the subunit was tested for polymerase
activity and for endonuclease, 3'->5' exonuclease and 5'->3' exonuclease
activities on ssDNA and dsDNA. However, no such activities were
observed.

Since θ is one of the subunits of polIII core, it was examined for
any effect it might exert on the DNA polymerase and 3'->5' exonuclease
activities of α and ε. Previous work compared the ability of αε and
polIII core to form the rapid and processive polymerase with
holoenzyme accessory proteins, but there was no significant difference
between αε and the polIII core (αεθ), suggesting θ had no role in the
speed and processivity of synthesis. With pure θ, assays could be
performed by either adding θ to αε or omitting θ. In a comparison of the
efficiency of αε complex and αεθ complex in their ability to reconstitute
the rapid processive polymerase with accessory proteins, the αε (or
αεθ) was mixed with the γ complex and β subunit in the presence of ATP
and phage φX174 ssDNA primed with a synthetic oligonucleotide and
"coated" with SSB. The mixture was preincubated for 6 minutes at 37°C
to allow the γ complex time to transfer the β ring to DNA forming the
preinitiation complex clamp and time for the polymerase to associate
with the preinitiation complex. The rapid processive polymerase can
fully replicate this template (5.4 kb) within 12 seconds. Replication
was then initiated by the addition of dATP and [α-32P]TTP, which were
omitted from the preincubation, and the reaction was terminated after
15 seconds. In this assay, the effect of θ on the amount of DNA
synthesis will be a reflection of either the speed or processivity of the
polymerase or the binding efficiency of the polymerase to the
preinitiation complex. Based on a previous comparison of αε and core, θ
was not expected to influence the speed or processivity of DNA
synthesis. However, in the prior study, the relative affinity of αε and
polIII core for the preinitiation complex was not examined.

The αε and αεθ were titrated into this reconstitution assay and
the results indicate that θ had little influence in the assay. Therefore,
θ does not significantly increase the affinity of αε for the preinitiation
complex. These results are also consistent with prior conclusions. The
accessory protein preinitiation complex greatly stimulates the activity
of the α subunit (without ε) in the reconstitution assay. However, this
"α holoenzyme" was half as fast as the "αε holoenzyme" and is only
processive for 1-3 kb. The ability of θ to stimulate this "α holoenzyme"
was tested in the absence of ε, but the θ subunit had no effect,
indicating that it did not increase the speed or processivity of the "α
holoenzyme" either.

θ was next examined for an effect on the 3'->5' exonuclease
activity of ε using a synthetic "hooked" primer template with a 3'
terminal G-T mispair. A slight (3-fold), but reproducible stimulation by
θ of excision of the 3' mismatched T residue by ε was observed. In the
absence of ε, addition of up to 1.0 μg of θ released no 3' terminal
nucleotide. These results are compatible with an earlier study
comparing 3' excision rates of polIII core and αε complex in which the
polIII core was approximately 3-fold faster than αε. Although a 3-fold
effect is not dramatic and may not be the true intracellular role of θ, it
is large enough to follow θ through the purification procedure. The
stimulation of ε exonuclease activity co-purified with θ throughout the
purification procedure and the overall activity was recovered in high
yield.

The polIII core subassembly of the holoenzyme consists of three
subunits: θ, α (polymerase), and ε (3'->5' exonuclease). Gel filtration
was used to analyze the ability of these individual subunits according
to the present invention to assemble into the polIII core assembly. α
and θ were mixed together and gel filtered; however, θ did not
comigrate with α. Upon mixing ε and θ, a stable εθ complex was formed.
The results of these studies are quite consistent with the activity
analysis presented above in which θ had no effect on the polymerase but
a noticeable effect on the activity of ε.

It has been reported that a concentrated preparation of polIII
core (18 μM) was dimeric, containing two molecules of polIII core which
were presumed to be dimerized through interaction between their θ
subunits since a concentrated solution of αε complex contained only one
α and one ε. However, in the gel filtration experiments of the present
invention, the reconstituted polIII core migrates only slightly larger
than the α subunit, indicating that θ did not act as an agent of polIII core
dimerization.

Gel filtration experiments performed at a concentration of 73
μM α and 73 μM ε in either the absence of θ (αε only), the presence of a
substoichiometric amount of θ (molar ratio α:ε:θ of 1:1:0.5), or with
excess θ (molar ratio 1:1:3), showed that the presence of θ did not
increase the aggregation state (i.e., monomer to dimer). Thus, it may be
considered that the αε complex by itself is a dimer. However,
comparison of αε and polIII core with size standards in the gel
filtration analysis shows that they elute near the 158 kDa IgG standard,
indicating that they are monomeric, i.e. one of each subunit in the
complex. They have a Stokes radius of 49 Å which is substantially the
radius determined for the αε complex (50 Å), and similar to the 54 Å
Stokes radius determined in studies of the dilute monomeric polIII core.

To increase confidence in the aggregation state of these
reconstituted complexes, the study of the αε complex and reconstituted
polIII core was extended to an analysis of their sedimentation behavior
in glycerol gradients using the same concentration and ratio of subunits
as in the gel studies. Again the αε and αεθ essentially co-sedimented
regardless of whether θ was present. The αε complex and polIII core
each sedimented with an S value close to that of the 150 kDa IgG size
standard, further indicating they are monomeric subassemblies.

The native molecular weights of θ, ε and of the εθ complex were
also determined using gel filtration and glycerol gradient
sedimentation. The θ and ε subunits were first analyzed separately: θ,
by itself, elutes after myoglobin which is 17.5 kDa, indicating θ is a
monomer (8.6 kDa) rather than a dimer of 17.2 kDa; ε migrated just
after an ovalbumin standard (43.5 kDa), consistent with ε as a 28.5 kDa
monomer rather than a 57 kDa dimer.

To assess the native molecular masses of θ, ε and the εθ complex,
the analysis was extended to sedimentation in glycerol gradients. The
Stokes radius and S values of θ, ε and εθ complex were determined by
comparison to protein standards and their observed mass was
calculated. The observed masses of θ, ε and εθ are 11.6 kDa, 32.7 kDa
and 35.5 kDa, respectively, values most consistent with θ as an 8.6 kDa
monomer, ε as a 28.5 kDa monomer, and the εθ complex having a
composition of ε1θ1 (37.1 kDa); densitometric analysis of the εθ
complex yielded a molar ratio of 1 mol of ε to 0.8 mol θ, consistent
with this composition.
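
The reasoning that assigns a composition to each species can be sketched
as a nearest-mass comparison: enumerate small candidate stoichiometries,
compute the expected mass of each from the monomer masses, and keep the
one closest to the observed value. The limit of two copies per subunit
and the function name `closest_composition` are assumptions made for
illustration; the observed and monomer masses are those stated above.

```python
EPSILON_KDA = 28.5  # epsilon monomer mass from the text
THETA_KDA = 8.6     # theta monomer mass from the text


def closest_composition(observed_kda, eps_counts, theta_counts):
    """Return ((n_epsilon, n_theta), expected_mass) nearest the observed mass."""
    candidates = [
        ((n_e, n_t), n_e * EPSILON_KDA + n_t * THETA_KDA)
        for n_e in eps_counts
        for n_t in theta_counts
        if n_e + n_t > 0
    ]
    return min(candidates, key=lambda item: abs(item[1] - observed_kda))


# theta alone, epsilon alone, and the mixture (at least one copy of each).
theta_fit = closest_composition(11.6, eps_counts=[0], theta_counts=[1, 2])
epsilon_fit = closest_composition(32.7, eps_counts=[1, 2], theta_counts=[0])
complex_fit = closest_composition(35.5, eps_counts=[1, 2], theta_counts=[1, 2])

for name, fit in [("theta", theta_fit), ("epsilon", epsilon_fit),
                  ("epsilon-theta", complex_fit)]:
    (n_e, n_t), mass = fit
    print(f"{name}: epsilon x{n_e}, theta x{n_t}, expected {mass:.1f} kDa")
```

Under these assumptions the nearest candidates are a θ monomer, an ε
monomer and a 1:1 εθ complex, matching the assignments in the text.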

The fourth subunit according to the present invention, that of ψ,
was also identified, purified, cloned and sequenced. N-terminal
analysis of the ψ peptide yielded a protein sequence which, when
translated to its genetic sequence, was found to be identical to a
portion of a much larger sequence described by Yoshikawa [see Mol. Gen.
Genet. 209:481 (1987)]. However, Yoshikawa's description was for a rimI
sequence from E. coli responsible for encoding an enzyme catalyzing
acetylation of the N-terminal portion of ribosomal protein S-18; his
upstream sequencing from this gene's reading frame was purely accidental
and he does not indicate any appreciation of the gene as a coding
sequence for the ψ peptide.

The amino acid sequence obtained from the ψ peptide is:

Met Thr Ser Arg Arg Asp Trp Gln Leu Gln Gln Leu Gly Ile Thr
                  5                  10                  15
Gln Trp Ser Leu Arg Arg Pro Gly Ala Leu Gln Gly Glu Ile Ala
                 20                  25                  30
Ile Ala Ile Pro Ala His Val Arg Leu Val Met Val Ala Asn Asp
                 35                  40                  45
Leu Pro Ala Leu Thr Asp Pro Leu Val Ser Asp Val Leu Arg Ala
                 50                  55                  60
Leu Thr Val Ser Pro Asp Gln Val Leu Gln Leu Thr Pro Glu Lys
                 65                  70                  75
Ile Ala Met Leu Pro Gln Gly Ser His Cys Asn Ser Trp Arg Leu
                 80                  85                  90
Gly Thr Asp Glu Pro Leu Ser Leu Glu Gly Ala Gln Val Ala Ser
                 95                 100                 105
Pro Ala Leu Thr Asp Leu Arg Ala Asn Pro Thr Ala Arg Ala Ala
                110                 115                 120
Leu Trp Gln Gln Ile Cys Thr Tyr Glu His Asp Phe Phe Pro Gly
                125                 130                 135
Asn Asp
    137

Using the information above, the sequence was translated into 
the genomic structure which is: 

ATG ACA TCC CGA CGA GAC TGG CAG TTA CAG CAA CTG GGC  39
ATT ACC CAG TGG TCG CTG CGT CGC CCT GGC GCG TTG CAG  78
GGC GAG ATT GCC ATT GCG ATC CCG GCA CAC GTC CGT CTG 117
GTG ATG GTG GCA AAC GAT CTT CCC GCC CTG ACT GAT CCT 156
TTA GTG AGC GAT GTT CTG CGC GCA TTA ACC GTC AGC CCC 195
GAC CAG GTG CTG CAA CTG ACG CCA GAA AAA ATC GCG ATG 234
CTG CCG CAA GGC AGT CAC TGC AAC AGT TGG CGG TTG GGT 273
ACT GAC GAA CCG CTA TCA CTG GAA GGC GCT CAG GTG GCA 312
TCA CCG GCG CTC ACC GAT TTA CGG GCA AAC CCA ACG GCA 351
CGC GCC GCG TTA TGG CAA CAA ATT TGC ACA TAT GAA CAC 390
GAT TTC TTC CCT GGA AAC GAC 411



In addition to the normal sequence for the genomic material, the
gene also contains an internal NdeI site.

The sequence above is preceded by an upstream sequence 
containing two underlined RNA polymerase promoter signals (TTGGCG 
5 and TATATT), and a Shine Dalgarno (AGGAG) sequence. The complete 
upstream sequence ,is; 

GQCGATTATA GOCM2YIGJT GGCGCQGTA OGAOGAATTT GCTATATTTG 50 
CX3COCCTGAC AACAGGAGCG ATTCGGT 77. 

In addition, the open reading frame is followed by a downstream
sequence beginning with a stop codon:

TGA TTTACCQGCA GCTTACCACA TTGAACAACG CGCCCACQCC TTTCCGTGGA 53 
GTGAAAAAAC GTTTQCCAGC AACCAGGGCG AGCGTTATCT CAACTTTCAG 103. 

The ψ gene was then produced by PCR using E. coli genomic DNA
and the following (5'->3') primers:
primer 1 (Psi-N):

GATTCCATAT GACATCCCGA CGAGACT 27; and 

primer 2 (Psi-C): 
GACISSA1CC CTGCAGGCCG GTGAATGAGT 30 

As can be seen, primer 1 contains an NdeI site, and primer 2
contains a BamHI site, both of which have been underlined above.

The PCR-produced DNA was used to clone the ψ gene into the pET-3c
expression plasmid using a two-step cloning procedure necessitated by
the internal NdeI site in the nucleic acid sequence. Briefly, this
procedure involved cutting the PCR product with NdeI restriction
enzyme into two portions of 379 bp (NdeI to NdeI) and 543 bp (NdeI to
BamHI). The 543 bp portion was ligated into plasmid pET-3c (4638
bp) to form an intermediate, pET-3ca (5217 bp). The pET-3ca was then
linearized, and the 379 bp portion inserted to form the desired pET-3c
plasmid containing the complete PCR product insert.
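
The reason for the two-step procedure can be illustrated by locating
the NdeI recognition sequence (CATATG) both in primer 1 and within the
open reading frame itself. The sketch below is illustrative only; the
function name find_sites is an assumption of this example, and the
short open-reading-frame excerpt is taken from the nucleotide listing
given earlier in this specification.

```python
NDEI = "CATATG"  # NdeI recognition sequence


def find_sites(seq, site=NDEI):
    """Return the 0-based start position of every occurrence of site in seq."""
    seq = seq.replace(" ", "").upper()
    hits = []
    start = seq.find(site)
    while start != -1:
        hits.append(start)
        start = seq.find(site, start + 1)
    return hits


primer_1 = "GATTCCATAT GACATCCCGA CGAGACT"  # Psi-N primer as listed
orf_tail = "TGC ACA TAT GAA CAC"            # codons 126-130 of the reading frame

print(find_sites(primer_1))  # [5]  site introduced at the initiating ATG
print(find_sites(orf_tail))  # [4]  internal site within the gene
print(379 + 543)             # 922  the two portions account for the PCR product
```

Because the enzyme cuts at both positions, a single NdeI/BamHI
digestion cannot release the gene as one fragment.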
The overexpression vector containing the complete insert was
then introduced into E. coli and induced with IPTG as described herein,
and overexpression (an increase to over 20% of total bacterial protein)
of the ψ protein was seen.



The ψ protein was purified by first dissolving the cell membrane
debris in 6 M urea, followed by passing the resulting solution through a
hydroxylapatite column which had previously been equilibrated with a 6
M urea buffer (180 g urea, 12.5 ml of 1 M Tris at pH 7.5, 0.5 ml of 0.5 M
EDTA, and 1 ml of 1 M DTT), wherein the ψ peptide flows through
while almost everything else in solution is held within the column.
The ψ peptide outflow of the hydroxylapatite column was then
bound to a DEAE column, rinsed with buffer, and eluted with a gradient
of NaCl. Fractions containing the ψ peptide were pooled, dialyzed twice
against 1 liter of buffer, and loaded onto a hexylamine column for final
purification. The hexylamine column was eluted with a NaCl gradient
(0.0 to 0.5 M), and fractions containing the ψ peptide were pooled and
saved as pure ψ subunit peptide.

Studies were also conducted to determine that the ψ gene
according to the present invention encodes the ψ subunit peptide. These
studies determined that the N-terminal sequence of native ψ peptide is
predicted by the ψ gene sequence according to the present invention;
native ψ peptide was obtained and digested with trypsin, and a few of
the resulting peptides were sequenced, the sequenced peptides being
encoded by the gene sequence according to the present invention; the
cloned/overproduced/pure ψ peptide made in accordance with the
present invention comigrated with the ψ subunit peptide within the
naturally occurring holoenzyme; and the ψ peptide produced from the
sequence according to the present invention formed a γχψ complex when
mixed with γ and χ, as would occur with natural components.

The γχψ complex was purified [see J. Biol. Chem. 265:1179 (1990)]
from 1.3 kg of the γ/τ overproducing strain (HB101 (pNT203, pSK100)).
The ψ subunit was separated from γ and χ by electrophoresis in a 15%
SDS-polyacrylamide gel, then ψ was electroblotted onto PVDF
membrane for N-terminal sequencing (220 pmol), and onto
nitrocellulose membrane for tryptic digestion (300 pmol) followed by
sequence analysis of tryptic peptides. Proteins were visualized by
Ponceau S stain. The N-terminal sequence was determined to be:
NH2-TSRRDDQLQQLGIT. Two internal tryptic peptides were determined
to be:

ψ-1:

NH2-Leu Gly Thr Asp Glu Pro Leu Ser Leu Glu Glu Ala Gln Val Ala
                     5                  10                  15
Ser Pro; and

ψ-2:

NH2-Ala Ala Leu Trp Gln Gln Ile Cys Thr Tyr Glu His Asp Phe Phe
                     5                  10                  15
Pro Ala

A 3.2 kb PstI/BamHI (DNA modification enzymes, New England
Biolabs) fragment containing holD was excised from λ5C1 and ligated
directionally into the polylinker of Blue Script (Stratagene). The 1.8 kb
AccI fragment (one site is in the vector and one is in the insert)
containing holD was excised, the ends filled in using Klenow
polymerase, then ligated into pUC18 (cut with AccI and the end filled
with Klenow) to yield pUC-ψ. Both strands of DNA were sequenced by
the chain termination method of Sanger using the Sequenase kit,
[α-³²P]dATP (radiochemicals, New England Nuclear), and synthetic DNA
17-mers (Oligos Etc., Inc.).

A 922 bp fragment was amplified from genomic DNA (strain
C600) using Vent DNA polymerase and two synthetic primers, an
upstream 32-mer (CAACAGGAGCGATTCCATATGACATCCCGACG), and a
downstream 30-mer (GATTCGGATCCCTGCAGGCCGGTGAATGAGT). The
first two nucleotides in the NdeI site (underlined) of the upstream
32-mer and the first 11 nucleotides of the downstream 30-mer (including
the underlined BamHI sequence) are not complementary to the genomic
DNA. Amplification was performed using a thermocycler in a volume of
100 µl containing 1 ng genomic DNA, 1 µM each primer, and 2.5 units of
Vent polymerase in 10 mM Tris-HCl (pH 8.3), 2 mM MgSO4, 200 µM each
dATP, dCTP, dGTP and TTP. Twenty five cycles were performed: 1
minute at 94°C, 1 minute at 42°C, 2 minutes at 72°C. Amplified DNA
was phenol extracted, ethanol precipitated, then digested with 50 units
NdeI in 100 µl of 20 mM Tris-acetate, 10 mM magnesium acetate, 1 mM
DTT and 50 mM potassium acetate (final pH 7.9), overnight at 37°C.
After confirming the NdeI digestion by agarose gel, 50 units of BamHI
was added and digestion was continued for 2 hours. The NdeI/NdeI
fragment, which contained most of the holD gene, and the NdeI/BamHI
fragment were separated in an agarose gel, electroeluted,
phenol/chloroform extracted, ethanol precipitated and dissolved in 10
mM Tris-HCl (pH 7.5), 1 mM EDTA. The holD gene was cloned into pET3c
in two steps. First, the NdeI/BamHI fragment encoding the C-terminus of
ψ was ligated into pET3c digested with NdeI and BamHI to generate
pETψc-ter. The NdeI/NdeI fragment was then ligated into pETψc-ter
(linearized with NdeI and dephosphorylated) to yield the
pET-ψ overproducer. DNA sequencing of the pET-ψ confirmed the
correct orientation of the NdeI/NdeI fragment.

The 25 µl assay contained 72 ng M13mp18 ssDNA (0.03 pmol as
circles) primed with a synthetic DNA 30-mer, 0.98 µg SSB (13.6 pmol
as tetramer), 82 ng αε (0.52 pmol), and 33 ng β (0.29 pmol as dimer) in
20 mM Tris-HCl (pH 7.5), 8 mM MgCl2, 40 mM NaCl, 5 mM DTT, 0.1 mM
EDTA, 40 µg/ml BSA, 0.5 mM ATP, and 60 µM each dCTP and dGTP.
Addition of χ, ψ and γδδ' complex to the assay was as follows. The γδδ'
complex, χ and ψ (ψ was initially in 4 M urea) subunits were
preincubated before addition to the assay for 30 minutes at 4°C at
concentrations of 2.4 µg/ml γδδ' complex (14.2 nM), 0-0.75 µg/ml χ (45
nM), and 0-0.75 µg/ml ψ (0-48 nM) in 25 mM Tris-HCl (pH 7.5), 2 mM
DTT, 0.5 mM EDTA, 50 µg/ml BSA, 20% glycerol (buffer B) (the
concentration of urea in the preincubation was 8.5 mM or less).
One-half µl of this protein mixture was added to the assay (urea was 0.17
mM or less in the assay after addition of ψ), then the assay was shifted
to 37°C for 5 minutes to allow polymerase assembly before initiating
DNA synthesis upon addition of dATP and [α-³²P]dTTP to 60 µM and 20
µM, respectively. After 20 seconds, DNA synthesis was quenched and
quantitated as described in the accompanying report. Assays to
quantitate ψ in purification were performed likewise except the protein
preincubation contained 2.4 µg/ml γδδ' (14.2 nM), 0.75 µg/ml χ (45 nM)
and up to 0.25 µg/ml of protein fraction containing ψ. After the 30
minute preincubation, 0.5 µl was added to the assay reaction. The SSB,
α, ε, β, γ, and τ subunits used in these studies were purified, and the
χ, δ and δ' subunits were prepared from their respective overproducing
strains. Concentrations of β, δ, δ', χ and ψ were determined from their
absorbance at 280 nm using their molar extinction coefficients: β,
17,900 M⁻¹ cm⁻¹; δ, 46,137 M⁻¹ cm⁻¹; δ', 60,136 M⁻¹ cm⁻¹; χ, 29,160
M⁻¹ cm⁻¹; and ψ, 24,040 M⁻¹ cm⁻¹.

The ATPase assay contained 140 ng M13mp18 ssDNA in 25 mM
Tris-HCl (pH 7.5), and 8 mM MgCl2, 50 µM [γ-³²P]ATP, 5.45 pmol γ or τ
(as dimers), 10.9 pmol χ and/or ψ (as monomers) (unless indicated
otherwise) and 1.4 µg SSB (19.4 pmol as tetramer) (when present).
Mixtures of proteins (ψ was initially 2 mg/ml (0.13 mM) in 4 M urea)
were preincubated 30 minutes on ice at 3.8 µM of each subunit (as
monomer) in 30 µl of 25 mM Tris-HCl (pH 7.5), 0.5 mM EDTA, 20%
glycerol (0.1 M urea final concentration) before addition to the assay
(15 mM urea final concentration). Assays were incubated at 37°C for
60 minutes (5 minutes for assays containing τ), then quenched upon
spotting 0.5 µl on polyethyleneimine thin layer plates (Brinkman
Instruments Co.). After chromatography in 0.5 M LiCl, 1 M formic acid,
the free phosphate at the solvent front and ATP remaining near the
origin were quantitated by liquid scintillation.

Samples of ψ (45 µg, 3 nmol as monomer, initially in 4 M urea),
or a mixture of ψ (45 µg, 3 nmol as monomer) with either γ (65 µg, 0.7
nmol as dimer) or τ (98 µg, 0.7 nmol as dimer) were incubated in 200 µl
of 25 mM Tris-HCl (pH 7.5), 0.1 M NaCl, 1 mM EDTA, 10% glycerol (0.5 M
urea was present after addition of ψ) for 30 minutes at 15°C. The
200 µl sample was injected onto a Pharmacia HR 10/30 gel filtration
column of either Superdex 75 or Superose 12 at a flow rate of 0.35
ml/min in 25 mM Tris-HCl (pH 7.5), 0.1 M NaCl, 1 mM EDTA, 10%
glycerol. After the first 5.6 ml, fractions of 170 µl were collected and
analyzed in a 15% SDS polyacrylamide gel and the value of Kav was
calculated.
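
The mass and molar amounts quoted for ψ in these procedures can be
related to one another through the 15,174 Da mass given for ψ elsewhere
in this specification. The following is a minimal sketch of that unit
conversion; the function names are assumptions introduced only for
illustration.

```python
PSI_MASS_DA = 15_174  # molar mass of psi in g/mol, as given in this specification


def nmol_from_ug(mass_ug, molar_mass_da):
    """Convert a mass in micrograms to an amount in nanomoles."""
    return mass_ug * 1e-6 / molar_mass_da * 1e9


def mM_from_mg_per_ml(conc_mg_ml, molar_mass_da):
    """Convert a concentration in mg/ml (equal to g/l) to millimolar."""
    return conc_mg_ml / molar_mass_da * 1e3


print(round(nmol_from_ug(45, PSI_MASS_DA), 2))        # 2.97, i.e. about 3 nmol
print(round(mM_from_mg_per_ml(2.0, PSI_MASS_DA), 2))  # 0.13 mM
```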

A sample of ψ (45 µg, 3 nmol as monomer, initially in 4 M urea) in
200 µl of 25 mM Tris-HCl (pH 7.5), 0.1 M NaCl, 1 mM EDTA, 5% glycerol
(0.5 M urea final concentration after addition of ψ) was layered onto a
12.3 ml gradient of 10%-30% glycerol in 25 mM Tris-HCl (pH 7.5), 0.1 M
NaCl, 1 mM EDTA. Protein standards in 200 µl of the same buffer were
loaded in another tube and the gradients were centrifuged at 270,000 x g
for 44 hours at 4°C. Fractions of 150 µl were collected and analyzed in a
15% SDS polyacrylamide gel stained with Coomassie Blue.
The γδδ' complex was formed upon incubation of 60 µg δ (1.55
nmol as monomer) and 60 µg δ' (1.62 nmol as monomer) with an excess
of γ (600 µg, 6.4 nmol as dimer) for 30 minutes at 15°C in 1 ml of 25
mM Tris-HCl (pH 7.5), 2 mM DTT, 0.5 mM EDTA, 20% glycerol (buffer A).
The mixture was chromatographed on a 1 ml HR 5/5 MonoQ column, and
eluted with a 30 ml linear gradient of 0 M-0.4 M NaCl in buffer A. The
γδδ' complex eluted at a unique position, after the elution of free δ', δ
and γ (in that order), and was well resolved from the excess γ. The pure
γδδ' complex was dialyzed against buffer A to remove salt. Protein
concentration was determined using BSA as a standard. Molarity of γδδ'
was calculated from protein concentration assuming the 170 kDa mass
of a complex with subunit composition γ2δ1δ'1, the composition
expected from stoichiometry of subunits in the γ complex.

The γχψ complex was prepared from 1.3 kg of E. coli and the ψ
subunit was resolved from the γ and χ subunits in an SDS-polyacrylamide
gel, then electroblotted onto PVDF membrane for analysis of the amino
acid sequence of the amino terminus of ψ. The ψ was also
electroblotted onto nitrocellulose followed by tryptic digestion, HPLC
purification of peptides and sequence analysis of two tryptic peptides.
Search of GenBank for DNA sequences encoding these peptides
identified a sequence which was published in a study of the rimI gene
[see Mol Gen Genet 209:481 (1987)]. In order to define the operon
structure of this DNA, the DNA upstream of rimI was sequenced. All
three peptide sequences of ψ were in one reading frame located
immediately upstream of rimI at 99.3 minutes on the E. coli
chromosome, which putatively encodes ψ and is referred to as holD.

The promoter for holD underlined in the sequence has been
identified previously as the promoter for the rimI gene, encoding the
acetylase of ribosomal protein S18, which initiates 29 nucleotides
inside of holD. Hence, holD is in an operon with rimI. Production of ψ
was inefficient relative to rimI protein as judged by the maxicell
technique, which detected rimI protein but not ψ. The promoter measured
by Northern analysis was strong [see Mol Gen Genet 209:481 (1987)] and
the Shine-Dalgarno sequence is a good match to the consensus sequence,
as is the spacing from the ATG needed for efficient translation.
Although the cellular abundance of ψ is not known, if one assumes all
the ψ is sequestered within the holoenzyme, then it is present in very
small amounts, there being only 10-20 copies of the holoenzyme in a
cell. Perhaps the 3-11 fold more frequent use of some rare codons may
contribute to inefficient translation (Leu (UUA), Ser (UCA and AGU), Pro
(CCU and CCC), Thr (ACA), Arg (CGA and CGG)).

The open reading frame of holD encodes a 137 amino acid protein
of 15,174 Da. Amino terminal analysis of the ψ protein within the γχψ
complex showed it was missing the initiating methionine. The molar
extinction coefficient of ψ calculated from its 4 Trp and 1 Tyr is
24,040 M⁻¹ cm⁻¹. There is a potential for a leucine zipper at amino
acid residues 25-53, although three prolines fall within the possible
leucine zipper. There is also a helix-turn-helix motif (A/GX3GX5I/V) at
G22G26I33, but again prolines may preclude helix formation. There is
no apparent nucleotide binding site or zinc finger motif.
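
The extinction coefficient quoted above can be reproduced by summing
per-residue contributions at 280 nm. The sketch below assumes the
commonly used values of 5690 M⁻¹ cm⁻¹ per tryptophan and 1280 M⁻¹ cm⁻¹
per tyrosine; the specification does not state which values were used,
and the function names are hypothetical.

```python
# Per-residue molar absorptivities at 280 nm in M^-1 cm^-1 (assumed values).
EPS_TRP = 5690
EPS_TYR = 1280


def extinction_280(n_trp, n_tyr):
    """Sum the tryptophan and tyrosine contributions at 280 nm."""
    return n_trp * EPS_TRP + n_tyr * EPS_TYR


def molar_concentration(a280, epsilon, path_cm=1.0):
    """Beer-Lambert law: concentration in mol/l from absorbance."""
    return a280 / (epsilon * path_cm)


eps_psi = extinction_280(n_trp=4, n_tyr=1)
print(eps_psi)  # 24040
```

With this coefficient, molar_concentration(a280, eps_psi) converts a
measured absorbance into a molar concentration of ψ.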

The polymerase chain reaction was used to amplify holD from
genomic DNA. The synthetic DNA oligonucleotides used as primers were
designed such that an NdeI site was formed at the initiating ATG of
holD and a BamHI site was formed downstream of holD. The amplified
holD gene was inserted into the NdeI/BamHI sites of pET3c in two steps
to yield pET-ψ, in which holD is under control of a strong T7 promoter
and is in a favorable context for translation. The sequence of holD in
pET-ψ was found to be identical to that depicted in the sequence, and
upon transformation into BL21(DE3)pLysS cells and subsequent induction
of T7 RNA polymerase with IPTG, the ψ protein was expressed to
approximately 20% of total cell protein.

The ψ protein was completely insoluble and resisted attempts to
obtain even detectable amounts of soluble ψ (lower temperature during
induction, shorter induction time, and extraction of the cell lysate with
1 M NaCl were tested); it was necessary to resort to solubilization of
the induced cell debris with 6 M urea followed by column
chromatography in urea. The ψ was approximately 40% of total protein
in the solubilized cell debris and was purified to apparent homogeneity
upon flowing it through hydroxyapatite, followed by column
chromatography on DEAE sepharose and heparin agarose. By this
procedure, 22 mg of pure ψ was obtained from 1 liter of cell culture in
61% yield. The pure ψ remained in solution upon complete removal of
the urea by dialysis as described in greater detail below.
Four liters of E. coli cells (BL21(DE3)pLysS) harboring the pET-ψ
plasmid were grown at 37°C in LB media supplemented with 1% glucose,
10 µg/ml thiamin, 50 µg/ml thymine, 100 µg/ml ampicillin and 30
µg/ml chloramphenicol. Upon reaching an OD of 1.0, IPTG was added to
0.4 mM and after an additional 2 hours of growth at 37°C, the cells
were harvested by centrifugation (20 g wet weight), resuspended in 20
ml of 50 mM Tris-HCl (pH 7.5) and 10% sucrose (Tris-sucrose) and
frozen at -70°C. The cells lysed upon thawing (due to lysozyme formed
by the pLysS plasmid), and the following components were added on ice
to pack the DNA and precipitate the cell debris: 69 ml Tris-sucrose, 1.2
ml unneutralized 2 M Tris base, 0.2 ml 1 M DTT, and 9 ml of heat lysis
buffer (0.3 M spermidine, 1 M NaCl, 50 mM Tris-HCl (pH 7.5), 10%
sucrose). After 30 minutes incubation on ice, the suspension was
centrifuged in a GSA rotor at 10,000 rpm for 1 hour at 4°C. The cell
debris pellet was resuspended in 50 ml buffer B using a dounce
homogenizer (B pestle), then sonicated until the viscosity was greatly
reduced (approximately 2 minutes total in 15 second intervals) and
centrifuged in two tubes at 18,000 rpm in a SS34 rotor for 30 minutes
at 4°C. The pellet was resuspended in 50 ml buffer B containing 1 M
NaCl using the dounce homogenizer, then pelleted again. This was
repeated, and followed by homogenizing the pellet once again in
50 ml buffer B and pelleting as was done initially. The following
procedures were at 4°C using only one-fourth of the pellet (equivalent
to 1 liter of the cell culture). The assay for ψ is described above, and
column fractions were analyzed in 15% SDS polyacrylamide gels to
directly visualize the ψ protein and aid the exclusion of contaminants
during the pooling of column fractions. The pellet was solubilized in 25
ml buffer A containing freshly deionized 6 M urea. The solubilized
pellet fraction (fraction I, 85 mg, 22 ml) was passed over a 10 ml
column of hydroxyapatite equilibrated in buffer A plus 6 M urea.
The ψ quantitatively flowed through the hydroxyapatite column, giving
substantial purification. The protein which flowed through the
hydroxyapatite column was immediately loaded onto a 10 ml column of
DEAE sephacel, equilibrated in buffer A containing 6 M freshly deionized
urea, and eluted with a 100 ml gradient of 0-0.5 M NaCl in buffer A
containing 6 M freshly deionized urea over a period of 4 hours.
Fractions of 1.25 ml were collected and analyzed for ψ as described.
Fractions were pooled and dialyzed overnight against 2 liters of buffer
A containing 3 M freshly deionized urea and then loaded onto a 10 ml
column of hexylamine sepharose. The hexylamine column was eluted
with a 200 ml gradient of 0 M-0.5 M NaCl in buffer B containing 3 M
freshly deionized urea over a period of 4 hours. Eighty fractions were
collected (2.5 ml each) and were analyzed for ψ, then fractions
containing ψ were pooled (fraction IV, 21.6 mg) and urea was removed
by extensive dialysis against 25 mM Tris-HCl (pH 7.5), 0.1 M NaCl, 0.5
mM EDTA (3 changes of 2 liters each). Protein concentration was
determined using BSA as a standard, except at the last step in which a
more accurate assessment of concentration was performed by
absorbance using the value ε280 equal to 24,040 M⁻¹ cm⁻¹ calculated from
the sequence of holD. After the absorbance measurement, DTT was
added back to 5 mM and the ψ was aliquoted and stored at -70°C.



                             total     total        specific     fold
                             protein   units¹       activity     purifi-  yield
Fraction                     (mg)                   (units/mg)   cation   (%)

I    Solubilized pellet      85.0      104.7x10⁷    12.0x10⁶     1.0      100
II   Hydroxylapatite         42.5       95.9x10⁷    22.6x10⁶     1.8       92
III  DEAE Sepharose          30.6       89.7x10⁷    29.3x10⁶     2.4       86
IV   Hexylamine Sepharose    21.6       63.9x10⁷    29.6x10⁶     2.4       61

¹One unit is defined as pmol of nucleotide incorporated in one minute
over and above the pmol incorporated in the assay in the absence of
added ψ.
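
The derived columns of the purification table follow from the total
protein and total units of each fraction. The sketch below recomputes
them as an illustration; the function name summarize and the tuple
layout are assumptions of this example. The specific activity
recomputed for fraction I (about 12.3x10⁶ units/mg) differs slightly
from the tabulated entry, while the fold purification and yield values
agree with the table.

```python
rows = [  # (fraction, total protein in mg, total units)
    ("I Solubilized pellet", 85.0, 104.7e7),
    ("II Hydroxylapatite", 42.5, 95.9e7),
    ("III DEAE Sepharose", 30.6, 89.7e7),
    ("IV Hexylamine Sepharose", 21.6, 63.9e7),
]


def summarize(rows):
    """Return (name, specific activity, fold purification, percent yield) per row."""
    base_units = rows[0][2]
    base_specific = rows[0][2] / rows[0][1]
    out = []
    for name, protein_mg, units in rows:
        specific = units / protein_mg               # units per mg protein
        fold = round(specific / base_specific, 1)   # relative to fraction I
        yield_pct = round(100 * units / base_units)
        out.append((name, specific, fold, yield_pct))
    return out


for name, specific, fold, yield_pct in summarize(rows):
    print(f"{name:26s} {specific:.3g} units/mg  {fold:.1f}-fold  {yield_pct}%")
```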

The pure ψ protein comigrated with the ψ subunit of pol III*
(holoenzyme lacking only β) in a 15% SDS-polyacrylamide gel. Analysis
of the N-terminal sequence of the pure cloned ψ matched that of the
holD sequence and the sequence of the natural ψ from within the γχψ
complex, indicating that the purified protein encoded by the gene had
been cloned.

The pure ψ appeared fully soluble in the absence of urea.
However, a 2 mg/ml solution of ψ, which appeared clear and could not
be sedimented in a table top centrifuge, still behaved as an aggregate in
a gel filtration column. Therefore, even though ψ appeared soluble it
was still an aggregate. The aggregated ψ had only weak activity in the
replication assay and was inefficient in binding to other proteins in
physical studies. Therefore, before use in assays or in physical
binding experiments, urea was added to a concentration of 4 M to
disaggregate ψ. Once disaggregated, the urea could be quickly removed
by gel filtration and ψ behaved well during filtration in the absence of
urea in the column buffer. However, upon standing a full day at high
concentration (>1 mg/ml) in the absence of urea, it would aggregate
again. ψ would work in urea provided the preparation was sufficiently
concentrated (2 mg/ml) such that it could be diluted at least 8-fold (to
0.5 M urea) for protein-protein interaction studies, 300-fold for
ATPase assays, and 30,000-fold for replication assays. In 0.5 M urea,
the ψ bound to γ and τ, and also to the χ subunit. ψ treated in this
manner was also functional in stoichiometric amounts with other
proteins in replication and ATPase assays.
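
The residual urea after each of these dilutions follows directly from
the 4 M starting concentration. A minimal sketch of the arithmetic is
given below; the function name is hypothetical. The results are of the
same order as the urea concentrations quoted for the individual assays
elsewhere in this specification.

```python
def diluted_molar(stock_molar, fold):
    """Concentration remaining after diluting a stock by the given factor."""
    return stock_molar / fold


UREA_STOCK_M = 4.0  # urea concentration used to disaggregate psi

for use, fold in [
    ("protein-protein interaction studies", 8),
    ("ATPase assays", 300),
    ("replication assays", 30_000),
]:
    print(f"{use}: {diluted_molar(UREA_STOCK_M, fold) * 1e3:.3g} mM urea")
```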

In a previous study, a γχψ complex was purified by resolving the δ
and δ' subunits out of the γ complex, leaving only a complex of γχψ.
Compared to γ, this γχψ complex was approximately 3-fold more active
in reconstituting the processive polymerase with δ, β, and αε at
elevated salt concentration. The simplest explanation for this result is
that at elevated salt, γχψδ is more active than γδ in assembling the β
ring around primed DNA.

The present invention indicates that a mixture of the γ, δ and δ'
subunits formed a stable (gel filterable) γδδ' complex. The αε
complex and β subunit were incubated with the γδδ' complex (with or
without χ and/or ψ) in a reaction containing SSB "coated" M13mp18
ssDNA primed with a synthetic DNA oligonucleotide and in the presence
of 40 mM added NaCl, and the reaction was incubated at 37°C for 5
minutes to allow the accessory proteins time to assemble the
preinitiation complex clamp and for the αε to bind the preinitiation
complex (the preinitiation complex is known to consist of a β dimer
ring clamped onto the DNA). Replication of the circular DNA was then
initiated upon addition of the remaining dNTPs and was quenched after
20 seconds, sufficient time for the rapid and processive holoenzyme to
complete the circle.

The results indicated that as ψ is titrated into the assay the
replication activity increased approximately 3.5-fold and plateaued at
approximately 1 mol ψ (as monomer) per mol γδδ' complex. ψ (without χ)
stimulates γδδ' and χ does not stimulate the reaction, but the presence
of both χ and ψ yields the most synthesis, as though χ does exert an
influence on the assay but only when ψ is also present.

Previously γ was observed to contain a low level of DNA
dependent ATPase activity (0.1 mol ATP hydrolyzed/mol γ/minute)
compared to the ATPase of the γ complex (6.8 mol ATP/mol γ
complex/minute). The γχψ complex resolved out of the γ complex
appeared to contain approximately 3-4 fold more DNA dependent ATPase
activity than γ, suggesting that χ and/or ψ stimulated the ATPase
activity of γ, or that there was an ATPase activity inherent within χ
and/or ψ. Now that the holC and holD genes have made available pure χ
and ψ in quantity, they have been studied for ATPase
activity and for their effect on the DNA dependent ATPase activity of γ.

As part of these studies of ATPase activity, all possible
combinations of γ, χ and ψ have been tested. These assays were
performed in the presence of M13mp18 ssDNA, one of the best DNA
effectors in the previous study of the γ complex ATPase activity. The
results showed that ψ alone, χ alone, and a mixture of χ and ψ had no
detectable ATPase activity and therefore neither ψ nor χ would appear
to have an intrinsic ATPase activity, although on the basis of negative
evidence we cannot rule out the possibility of a cryptic ATPase; the γ
subunit has a weak ATPase activity. The χ subunit has no effect on the
ATPase activity of γ. However, addition of ψ to γ stimulated the ATPase
activity of γ approximately 3-fold. Titration of ψ into the ATPase assay
showed ψ saturated the ATPase assay at approximately 2 mol ψ (as
monomer) to 1 mol γ (as dimer). Addition of the χ subunit to the γψ
mixture resulted in a further 30% increase in ATPase activity.

In the presence of SSB, which "coats" the ssDNA, the ATPase
activities of γ, γψ and γχψ were all greatly reduced (50-fold). However,
of the remaining activity, the γχψ complex was 4-fold more active than
γψ, showing that χ significantly stimulates the γψ ATPase when the DNA
is "coated" with SSB.

The ATPase assay of ψ and χ was extended to the DNA dependent
ATPase activity of the τ subunit. The τ and γ subunits are encoded by the
same gene and, as a result, τ contains the γ sequence plus approximately
another 24 kDa of protein which is responsible for both the ability of τ
to bind DNA and to bind the polymerase subunit, α. In addition, τ has a
much greater DNA dependent ATPase activity than γ, approximately
6-10 mol ATP hydrolyzed/mol τ/minute, for a 60-fold greater activity of
τ relative to γ.

Neither ψ, χ, nor a mixture of χ and ψ had a significant influence on
the ATPase activity of τ. "Coating" the ssDNA with SSB reduced the
ATPase activity of τ 20-fold, and now the χ and ψ subunits stimulated
the τ ATPase 10-fold to bring its activity back to about half of its value
in the absence of SSB. In this case, with SSB present, the ψ stimulated
τ approximately 3-fold, and χ, which had no effect on τ without ψ,
stimulated the τψ ATPase another 3-fold.

To gain a better understanding of the ψ molecule, the present
invention studied the hydrodynamic properties of ψ in gel filtration and
glycerol gradient sedimentation to determine whether ψ is a monomer
or a dimer (or larger). The Stokes radius of ψ was 19 Å upon comparing
its position of elution from a gel filtration column with that of protein
standards of known Stokes radius. The ψ eluted in the same position as
myoglobin (17.5 kDa), indicating ψ is a 15 kDa monomer rather than a
dimer of 30 kDa. The protein sedimented with an S value of 1.95
relative to several protein standards, and was slightly slower than
myoglobin, which is consistent with ψ as a monomer. If a protein has an
asymmetric shape, its migration will not reflect its true weight in
either of these techniques. However, asymmetric shape
has opposite effects in these techniques and can be eliminated by the
fact that the shape factor cancels when the S value and Stokes radius
are both combined in one mass equation. This calculation results in a
native molecular mass for ψ of 15.76 kDa, close to the 15 kDa
monomeric mass of ψ calculated from its gene sequence. Hence ψ
behaves as a monomer under these conditions. The frictional
coefficient of ψ calculated from its Stokes radius and native mass is
1.13, slightly greater than 1.0, which indicates some asymmetry in the
shape of ψ.
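
One widely used form of such a combined mass equation is
M = 6πηN·a·s / (1 − v̄ρ), where a is the Stokes radius and s the
sedimentation coefficient. The specification does not state which
equation or solvent parameters were used, so the sketch below is an
illustration under assumed values for viscosity, partial specific
volume and solvent density; its output only approximates the figures
quoted above.

```python
import math

AVOGADRO = 6.02214e23  # mol^-1
ETA = 1.002e-2         # g cm^-1 s^-1, water at 20 C (assumed)
V_BAR = 0.73           # cm^3/g, typical protein partial specific volume (assumed)
RHO = 1.0              # g/cm^3, solvent density (assumed)


def native_mass(stokes_radius_cm, s_seconds):
    """M = 6*pi*eta*N*a*s / (1 - v_bar*rho), in g/mol."""
    numerator = 6 * math.pi * ETA * AVOGADRO * stokes_radius_cm * s_seconds
    return numerator / (1 - V_BAR * RHO)


def frictional_ratio(stokes_radius_cm, mass_da):
    """f/f0 = a / r0, with r0 the radius of an anhydrous sphere of equal mass."""
    r0 = (3 * V_BAR * mass_da / (4 * math.pi * AVOGADRO)) ** (1 / 3)
    return stokes_radius_cm / r0


a = 19e-8     # Stokes radius of 19 angstroms, in cm
s = 1.95e-13  # sedimentation coefficient of 1.95 S, in seconds
mass = native_mass(a, s)
print(round(mass), round(frictional_ratio(a, mass), 2))
```

Under these assumptions the mass comes out near 15.6 kDa with a
frictional ratio slightly above 1.1; small changes in the assumed
parameters shift both values by a few percent.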

Although the initial use of 4 M urea would have monomerized ψ if
it were a native dimer, the ψ preparation was diluted such that the
concentration of urea was 0.5 M before it was applied to either the gel
filtration column or the glycerol gradient, and the buffer used in the
column and in the gradient contained no urea. Of course, one should
still be concerned that 0.5 M urea is high enough to disaggregate a
dimer of ψ and that the dimer hasn't time to reassociate during
filtration and sedimentation. Yet under these very conditions it was
found that ψ forms a protein-protein complex with γ, with τ and also
with χ. Therefore it seems likely that if ψ were naturally a dimer,
the dimer could have reformed under these same conditions under which
ψ can bind all these other subunits. Further, a monomeric nature of ψ is
not unusual, as most subunits of the holoenzyme are monomers when
isolated (α, ε, θ, χ, δ, δ'; only γ, τ and β are dimers).

A complex of γχψ can be purified from cells, indicating that ψ or χ
(or both) must directly interact with γ.

Gel filtration of a mixture of γ with a 4-fold molar excess of ψ
showed that ψ coeluted with the γ subunit, followed later by the elution
of the rest of the excess ψ. Hence, the ψ subunit does in fact bind
directly to γ.

Work on the fifth subunit according to the present invention, χ, began
with the N-terminal analysis of χ, which provided a sequence, a portion
of which was found to be related, in part, to the sequence of the
xerB gene [see EMBO 8(5):1623 (1989)]. Although not included in the
1S§2 bp sequence in the publication, a fuller, more complete sequence
(from 1 to 2035) of the xerB gene was provided to GenBank. In this
submission, the "front-end" portion of the χ gene according to the
present invention was presented. However, in neither the publication
nor in GenBank was the "front-end" portion identified as coding for a
protein. Based upon the molecular weight of χ as determined by SDS-PAGE
analysis, the "front-end" portion reported in GenBank predicts
approximately 70% of the expected length of χ.

A subsequent literature study located a gene named valS which
was located downstream of the xerB gene. It appeared (and was
confirmed during the research leading to the present invention) that the
χ gene, in its entirety, must be located between the xerB and valS genes.

Edman degradation amino acid sequencing was performed on an
Applied Biosystems 470A gas phase microsequencing apparatus. The
γχψ complex of the holoenzyme was purified, and 10 µg was
electrophoresed in a 15% polyacrylamide gel [see Nature 227:680
(1970)] followed by transfer to an Immobilon PVDF membrane
(Millipore) for N-terminal sequence analysis as with the previous
subunits according to the present invention. Internal peptide sequences
were obtained by electrophoresis of 10 µg of the γχψ complex in a 15%
polyacrylamide gel, followed by transfer to nitrocellulose membrane,
digestion by trypsin in situ, and analysis by gas phase microsequencing.

The χ, or holC, gene according to the present invention is located
at 96.5 minutes on the E. coli chromosome and encodes a 147 amino acid
protein of 16.6 kDa.
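
The stated length and approximate mass are consistent with the
441-nucleotide open reading frame listed for holC in this
specification. The sketch below checks this using a rough average
residue mass of 110 Da, which is a rule-of-thumb assumption of this
example rather than a value taken from the specification; the function
name is likewise hypothetical.

```python
AVG_RESIDUE_DA = 110  # rough average amino acid residue mass (assumed)


def orf_summary(n_nucleotides):
    """Return (residue count, rough mass estimate in Da) for a stop-free frame."""
    if n_nucleotides % 3:
        raise ValueError("open reading frame length must be a multiple of 3")
    residues = n_nucleotides // 3
    return residues, residues * AVG_RESIDUE_DA


print(orf_summary(441))  # (147, 16170)
```

The estimate of roughly 16.2 kDa lies within a few percent of the
16.6 kDa mass stated for χ.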

The recombinant λ phage 5C4 from the overlapping λ library of
Kohara [see Cell 50:495 (1987)] was used in determining the DNA
sequence of the χ gene. The DNA fragment containing the χ gene was
identified and cloned into pUC18 using conventional techniques. DNA
sequencing of both strands of the χ gene was performed on the
duplex plasmid by the dideoxy chain termination method of Sanger using
the Sequenase kit; sequencing reactions were analyzed on 6%
polyacrylamide, 50% (w/v) urea gels.

The sequences of the primers (5'->3') used for PCR amplification
of the χ gene during the cloning of the χ gene are as follows:

30-mer primer:
C03CA£MML^AAAMCGCG ACGITCTACC 30; 
28-mer primer:

ACCCSGMIII_AAACTGCCGG TGACATTC 28 

The 30-mer hybridizes over the initiation codon of χ, and a
two-nucleotide mismatch results in an NdeI restriction site (underlined)
at the ATG initiation codon upon amplification of the gene. The 28-mer
anneals within the valS gene downstream of the χ gene; a BamHI
restriction site (underlined) is embedded within the six nucleotides
which do not hybridize to valS.

Using these codons, subsequent studies isolated the χ gene
sequence which is, according to the present invention:

ATG AAA AAC GCG ACG TTC TAC CTT CTG GAC AAT GAC ACC 39
ACC GTC GAT GGC TTA AGC GCC GTT GAG CAC CTG GTG TGT 78
GAA ATT GCC GCA GAA CGT TGG CGC AGC GGT AAG CGC GTG 117
CTC ATC GCC TGT GAA GAT GAA AAG CAG GCT TAC GCC CTG 156
GAT GAA GCC CTG TGG GCG CGT CCG GCA GAA AGC TTT GTT 195
CCG CAT AAT TTA GCG GGA GAA GGA CCG CGC GGC GGT GTA 234
CCG GTG GAG ATC GCC TGG CCG CAA AAG CGT AGC AGC AGC 273
CGG CGC GAT ATA TTG ATT AGT CTG CGA ACA AGC TTT GGA 312
GAT TTT GCC ACC GCT TTT ACA GAA GTG GTA GAC TTC GTT 351
CCT CAT GAA GAT TCT CTG AAA CAA CTG GCG CGC GAA CGC 390
TAT AAA GCC TAC CGC GTG GCT GGT TTC AAC CTG AAT ACG 429
GGA ACT TGG AAA 441
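
The reading frame listed above can be checked mechanically. The following is a minimal Python sketch, assuming the standard genetic code; the names `CODON_TABLE` and `translate` are illustrative and are not part of the original disclosure.

```python
from itertools import product

# Standard genetic code with bases ordered T, C, A, G (assumed for this sketch).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA string into one-letter amino acids.

    Whitespace is ignored, trailing partial codons are dropped, and
    translation stops at the first stop codon.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 39 nucleotides of the open reading frame as listed above.
first_line = "ATG AAA AAC GCG ACG TTC TAC CTT CTG GAC AAT GAC ACC"
print(translate(first_line))
```

A 441-nucleotide frame corresponds to 441 / 3 = 147 codons, which agrees with the 147 amino acid length stated for the protein.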

The upstream portion of the holC gene is:
TAACGGCGAA GAGTAATTGC GTCAGGCAAG GCTGTTATJ1G_JXGGATGCGG 50 
CGTGAACGCC TTATCCGACC TACACAGCAC TGAACTCGTA GGCCTGATAA 100
GACACAACAG CGTCGCATCA GGCGCTGCGG TGTATACCTG ATGCGTATTT 150 
AAATCCACCA CAAGAAGCOC CATTT 175 

The downstream sequence begins with the stop codon:
TAATGGAAAA GACATATAAC CCACAAGATA TCGAACAGCC 40
GCTTTACGAG CACTGGGAAA AAAGCCAGGA MGTITCTGC 80
ATCATGATCC CGCCGCCGAA 100

The underlined nucleotide sequences indicate the potential
Shine-Dalgarno sequence (AAGAAG) of holC, and the nearest possible
promoter signals (TTGCCG and TATCCG) are highlighted in the first
underlined region. The stop codon of the upstream xerB gene (TAA) and
the start codon of the downstream valS gene (ATG) are each double
underlined. This translated into the correct peptide which is:



Met Lys Asp Ala Thr Phe Tyr Leu Leu Asp Asn Asp Thr Thr Val
                5                   10                  15

Asp Gly Leu Ser Ala Val Glu Gln Leu Val Cys Glu Ile Ala Ala
                20                  25                  30

Glu Arg Trp Arg Ser Gly Lys Arg Val Leu Ile Ala Cys Glu Asp
                35                  40                  45

Glu Lys Gln Ala Tyr Arg Leu Asp Glu Ala Leu Trp Ala Arg Pro
                50                  55                  60

Ala Glu Ser Phe Val Pro His Asn Leu Ala Gly Glu Gly Pro Arg
                65                  70                  75

Gly Gly Ala Pro Val Glu Ile Ala Trp Pro Gln Lys Arg Ser Ser
                80                  85                  90

Ser Arg Arg Asp Ile Leu Ile Ser Leu Arg Thr Ser Phe Ala Asp
                95                  100                 105

Phe Ala Thr Ala Phe Thr Glu Val Val Asp Phe Val Pro Tyr Glu
                110                 115                 120

Asp Ser Leu Lys Gln Leu Ala Arg Glu Arg Tyr Lys Ala Tyr Arg
                125                 130                 135

Val Ala Gly Phe Asn Leu Asn Thr Ala Thr Trp Lys
                140                 145     147

EXAMPLE IV

(molecular cloning, cell growth and purification)

PCR reactions were performed with both Vent polymerase
(Biolabs) and Taq polymerase. In a 100 μl volume, the PCR reaction was
conducted in a reaction buffer containing 10 mM Tris-HCl (pH 8.3), 50
mM KCl, 1.5 mM MgCl2, and 0.01% (w/v) gelatin, 1.0 μM of each primer,
and 200 μM each dNTP (Pharmacia-LKB), on 1 ng E. coli genomic DNA
(prepared from K12 strain C600) with 2.5 u polymerase. PCR
amplification was performed in a DNA Thermal cycler model 9801
using the following cycle: melting temperature 94°C for 1 min,
annealing temperature 60°C for 2 min, and extension temperature 72°C
for 2 min. After 30 cycles, the amplified DNA was purified by phenol
extraction in 2% SDS followed by sequential digestion of 10 μg DNA
with 10 u NdeI, followed by 10 u BamHI. The 600 bp DNA fragment was
purified from a 0.8% agarose gel by electroelution, and ligated into
pET3c previously digested with both NdeI and BamHI restriction
enzymes. The resulting plasmids (pETχ-1, pETχ-2 and pETχ-3) were
transformed into E. coli strain BL21(DE3)pLysS.
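
The reagent amounts implied by the reaction conditions above follow from simple concentration arithmetic. The sketch below is illustrative only: the helper name `moles_in_volume` is an assumption, and the cycling time counts only the programmed hold times (it ignores ramp times between temperatures, which the text does not specify).

```python
def moles_in_volume(conc_molar: float, volume_liters: float) -> float:
    """Amount of solute (mol) in a given volume at a given molar concentration."""
    return conc_molar * volume_liters


REACTION_VOLUME_L = 100e-6  # 100 microliter reaction, as stated in the text

# 1.0 micromolar of each primer -> about 100 pmol per reaction
primer_pmol = moles_in_volume(1.0e-6, REACTION_VOLUME_L) * 1e12

# 200 micromolar of each dNTP -> about 20 nmol per reaction
dntp_nmol = moles_in_volume(200e-6, REACTION_VOLUME_L) * 1e9

# 30 cycles of 1 min melting + 2 min annealing + 2 min extension
CYCLES = 30
hold_minutes_per_cycle = 1 + 2 + 2
total_hold_minutes = CYCLES * hold_minutes_per_cycle

print(f"primer per reaction: {primer_pmol:.0f} pmol")
print(f"each dNTP per reaction: {dntp_nmol:.0f} nmol")
print(f"programmed hold time: {total_hold_minutes} min")
```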

The freshly transformed BL21(DE3)pLysS pETχ cells were grown
in LB media containing 100 μg/ml ampicillin and 30 μg/ml
chloramphenicol at 37°C. Upon growth to an OD600 of 0.7, isopropyl
β-D-thiogalactopyranoside (IPTG) was added to a final concentration of
0.4 mM. Incubation was continued for 3 hr at 37°C before harvesting the
cells.

Seven mg of homogeneous χ protein was purified from a 4-liter
induced culture in which nearly 30% of the overproduced χ protein was
in soluble form. The 4-liter culture was grown to an OD600 of 0.7 before
addition of IPTG to 0.4 mM. After a further 3 hr incubation at 37°C, the
cells (25 g) were harvested, resuspended in 25 ml ice-cold 50 mM
Tris/10% sucrose, and lysed by 25 mg lysozyme on ice for 45 min and a
subsequent incubation at 37°C for 5 min in 5 mM Tris, 1% sucrose, 30
mM spermidine, and 100 mM NaCl. The cell lysate was clarified by
centrifugation at 12,000 rpm for 1 hr at 4°C. All subsequent column
chromatography procedures were at 4°C. All the columns were
equilibrated in buffer A (20 mM Tris (pH 7.5), 0.5 mM EDTA, 5 mM DTT,
and 20% glycerol). The χ protein was followed through the purification
process by SDS-PAGE gel analysis. Total protein was estimated [see
Anal. Biochem. 72:248 (1976)] using bovine serum albumin as a standard.
The soluble lysate (120 ml, 520 mg protein) was dialyzed against 4
liters of buffer A for 16 hours before being loaded onto a 60 ml
heparin-agarose column. The fractions containing χ, which eluted off the
column during wash with buffer A, were pooled (360 ml, 365 mg
protein), and loaded directly onto a FPLC 26/10 Q sepharose fast flow
column. The Q sepharose fast flow column was eluted with a 650-ml
linear gradient of 0 M to 0.5 M NaCl in buffer A. The fractions
containing χ, eluted at approximately 0.16 M salt in a volume of 45 ml
(60 mg protein), were pooled, dialyzed overnight against 4 liters of
buffer A, and loaded onto a 1 ml N-6 ATP-agarose column. The γ complex
(γδδ'χψ) binds to the column tightly due to the strong ATP binding
capacity of γ, while χ protein by itself flows through. This column was
included to eliminate any γ complex from the χ preparation.
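
For orientation, the salt concentration at any point of a linear gradient can be estimated from the gradient volume delivered. The sketch below is a simplified model, assuming an ideal linear gradient with no column dead volume or mixing delay; the function names are hypothetical.

```python
def gradient_concentration(volume_ml: float, total_ml: float,
                           start_molar: float = 0.0,
                           end_molar: float = 0.5) -> float:
    """Nominal salt concentration after `volume_ml` of a linear gradient."""
    if not 0 <= volume_ml <= total_ml:
        raise ValueError("volume must lie within the gradient")
    return start_molar + (end_molar - start_molar) * volume_ml / total_ml


def volume_at_concentration(target_molar: float, total_ml: float,
                            start_molar: float = 0.0,
                            end_molar: float = 0.5) -> float:
    """Gradient volume at which the nominal concentration reaches the target."""
    return total_ml * (target_molar - start_molar) / (end_molar - start_molar)


# 650 ml gradient of 0 to 0.5 M NaCl; elution at about 0.16 M salt
elution_volume = volume_at_concentration(0.16, 650.0)
print(f"0.16 M is reached after about {elution_volume:.0f} ml of gradient")
```

Under these idealized assumptions, 0.16 M NaCl is reached roughly one third of the way through the 650 ml gradient; a real elution position would also depend on the column and tubing volumes.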

The flow-through of the ATP-agarose column was loaded onto an
8 ml hexylamine column and χ was eluted with an 80 ml linear gradient
of 0 to 0.5 M NaCl in buffer A. The χ protein eluted at approximately
0.25 M salt. Fractions containing the peptide (81 ml, 36 mg protein)
were pooled and dialyzed against buffer A. The χ protein was loaded
onto an 8 ml FPLC Mono Q column, and eluted with an 80 ml linear
gradient of 0 to 0.5 M NaCl in buffer A. The fractions containing χ (28
ml, 8.5 mg protein) eluted sharply at 0.16 M salt. The χ protein was
pooled and dialyzed overnight against 4 liters of buffer A, then
aliquoted and stored at -70°C.

The concentration of purified χ protein was determined from its
absorbance at 280 nm. The molar extinction coefficient at 280 nm
(ε280) of a protein in its native state can be calculated from its gene
sequence to within +/- 5% by using the equation

ε280 = Trp(m) x (5690 M⁻¹cm⁻¹) + Tyr(n) x (1280 M⁻¹cm⁻¹)

[see Analytical Biochemistry 182:319 (1989)] wherein m and n are the
numbers of tryptophan and tyrosine residues, respectively, in the
peptide. The molar extinction coefficients of tryptophan and tyrosine
are known [see Biochemistry 6:1948 (1967)]. For χ protein, where m
equals 4 and n equals 5, ε280 = 29,160 M⁻¹cm⁻¹. The χ protein was
dialyzed against buffer A containing no DTT prior to absorbance
measurement. SDS-PAGE was performed in a 15% polyacrylamide, 0.1% SDS
gel in Tris/glycine/SDS buffer [see Nature 227:680 (1970)]. Proteins
were visualized by Coomassie Brilliant Blue stain.
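
The extinction coefficient arithmetic above can be reproduced in a few lines. This is a minimal sketch using the per-residue coefficients quoted in the text; the function names and the example absorbance reading are hypothetical, and a 1 cm path length is assumed.

```python
# Per-residue molar extinction coefficients at 280 nm (M^-1 cm^-1), from the text.
EPSILON_TRP = 5690
EPSILON_TYR = 1280


def epsilon_280(n_trp: int, n_tyr: int) -> int:
    """Estimated molar extinction coefficient at 280 nm (M^-1 cm^-1)."""
    return n_trp * EPSILON_TRP + n_tyr * EPSILON_TYR


def molar_concentration(a280: float, epsilon: float, path_cm: float = 1.0) -> float:
    """Beer-Lambert law: concentration (M) = A / (epsilon * path length)."""
    return a280 / (epsilon * path_cm)


# The text gives 4 tryptophan and 5 tyrosine residues for the protein.
eps = epsilon_280(n_trp=4, n_tyr=5)
print(eps)

# Hypothetical absorbance reading, for illustration only.
example_a280 = 0.5
print(f"{molar_concentration(example_a280, eps) * 1e6:.1f} micromolar")
```

With m = 4 and n = 5 the sum is 22,760 + 6,400 = 29,160 M⁻¹cm⁻¹, the value stated above.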

The γχψ complex was purified from 1.3 kg of E. coli. The χ subunit
was resolved from the γ and ψ subunits upon electrophoresis in a 15%
SDS polyacrylamide gel followed by transfer of χ onto PVDF membrane
for N-terminal sequence analysis (210 pmol), and onto nitrocellulose
membrane for tryptic analysis (300 pmol). Proteins were visualized by
Ponceau S stain. The amino terminus of χ was determined to be:

NH2-Met Lys Asn Ala Thr Phe Tyr Leu Leu Asp Asn Asp Thr Thr Val
                    5                   10                  15

    Asp Gly Leu Ser Ala Val Glu Gln Leu Val Xxx Glu Ile Ala
                    20                  25

wherein Xxx is an unidentified residue. Tryptic digestion yielded four
internal peptides, whose sequences were determined to be:

χ-1:

NH2-Val Leu Ile Ala Xxx Glu Asp Glu Lys
                    5

χ-2:

NH2-Leu Asp Glu Ala Leu Trp Ala Ala Pro Ala Glu Ser Phe Val Pro
                    5                   10                  15

    His Asn Leu Ala Gly Glu
                    20

χ-3:

NH2-Gly Gly Ala Pro Val Glu Ile Ala Trp Pro
                    5                   10

χ-4:

NH2-Gly Phe Asn Leu Asn Thr Ala Thr
                    5
The 3.4 kb BamHI fragment containing holC was excised from λ
5C4 and ligated into the BamHI site of pUC-χ. Both strands of the holC
gene were sequenced on the duplex plasmid by the chain termination
method of Sanger, using synthetic 17-mer DNA oligonucleotides.
Sequencing reactions were analyzed on 6% (w/v) acrylamide, 50% (w/v)
urea gels and were performed with both dGTP and dITP.

The sequences of the primers used to amplify the holC gene were:
Upstream 30-mer:

CCCC RCATAT G AAAAACGCG ACGTTCTACC 30 

Downstream 28-mer:

ACC( XGAT(X AAACTGCCGG TGACGTTC 28 
The upstream 30-mer hybridizes over the initiation codon of holC, and a
two-nucleotide mismatch results in an NdeI restriction site (underlined
above) at the ATG initiation codon upon amplification of the gene. The
downstream 28-mer anneals within the valS gene downstream of holC.
A BamHI restriction site (underlined) is embedded in the sequence,
resulting in three nucleotides which are not complementary to valS.
Amplification reactions contained 1.0 μM of each primer, 200 μM of
each dNTP, 1 ng E. coli genomic DNA (from strain C600), and 2.5 units of
Taq I DNA polymerase in a final volume of 100 μl of 10 mM Tris-HCl (pH
8.3), 50 mM KCl, 1.5 mM MgCl2, and 0.01% (w/v) gelatin. Amplification
was performed in a thermal cycler using the following cycle: 94°C, 1
minute; 60°C, 2 minutes; and 72°C, 2 minutes. After 30 cycles, the
amplified 604 bp DNA was purified by phenol extraction in 2% SDS
followed by sequential digestion of 10 μg DNA with 10 units of NdeI and
then 10 units of BamHI according to manufacturer's specifications. The
NdeI-BamHI fragment was electroeluted from a 0.8% agarose gel and
ligated into gel-purified pET3c previously digested with both NdeI and
BamHI. The resulting plasmid, pET-χ, was sequenced, which confirmed
that no errors had been introduced during amplification, and it was then
transformed into strain BL21(DE3)pLysS.

The γ subunit was purified from an overproducing strain, and the
δ, δ' and ψ subunits were purified from their respective overproducing
strains as described above. A mixture of 48 μg γ (0.51 nmol as dimer),
144 μg δ (3.7 nmol as monomer), 144 μg δ' (3.9 nmol as monomer), and
192 μg ψ (12.7 nmol as monomer) was incubated at 15°C for 1 hour and
then loaded onto a 1 ml HR 5/5 Mono Q column. The concentration of γ
was determined using BSA as a standard. Concentrations of δ, δ' and ψ
were determined from their absorbance at 280 nm using the molar
extinction coefficients 46,137 M⁻¹cm⁻¹, 60,136 M⁻¹cm⁻¹ and 24,040
M⁻¹cm⁻¹, respectively. The column was eluted with a 32 ml gradient
of 0 M - 0.4 M NaCl in 20 mM Tris-HCl (pH 7.5), 0.5 mM EDTA, 2 mM DTT,
and 20% glycerol (buffer A), whereupon the γδδ'ψ complex resolved from
uncomplexed subunits by eluting later than all the rest. Eighty
fractions were collected and analyzed by a Coomassie Blue stained 15%
SDS polyacrylamide gel. Fractions containing the γδδ'ψ complex were
pooled, the protein concentration was determined using BSA as a
standard, and then the γδδ'ψ complex was aliquoted and stored at -70°C.
Molarity of γδδ'ψ was calculated from the protein concentration assuming
the 185 kDa mass calculated from gene sequences assuming a
stoichiometry of γ2δ1δ'1ψ1 as expected from the tentative
stoichiometry of subunits in the γ complex.

The reconstitution assay contained 72 ng M13mp18 ssDNA (0.03
pmol as circles) uniquely primed with a DNA 30-mer, 980 ng SSB (13.6
pmol as tetramer), 10 ng β (0.13 pmol as dimer), 55 ng αε complex
(0.35 pmol) in a final volume (after addition of proteins) of 25 μl 20
mM Tris-HCl (pH 7.5), 0.1 mM EDTA, 8 mM MgCl2, 5 mM DTT, 4% glycerol,
40 μg/ml BSA, 0.5 mM ATP, 40 mM NaCl, 60 μM each dCTP, dGTP, dATP,
and 20 μM [α-32P]dTTP. Pure χ protein or column pool containing χ
(1-12 ng) was preincubated on ice for 30 minutes with 37 ng γδδ'ψ
complex (0.2 pmol) in 20 μl of 20 mM Tris-HCl (pH 7.5), 2 mM DTT, 0.5
mM EDTA, 20% glycerol, and 50 ng/ml BSA before dilution with the same
buffer such that 0.14 ng (0.76 fmol) of the γδδ'ψ complex was added to
the assay in a 1-2 μl volume. The assay was then shifted to 37°C for 5
minutes. DNA synthesis was quenched by spotting directly onto DE81
filter paper and quantitated. The αε complex, β and SSB proteins used in
the reconstitution assay were purified and their concentrations
determined using BSA as a standard, except for β, which was determined
by absorbance using an ε280 value of 17,900 M⁻¹cm⁻¹.
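
The paired mass and molar amounts quoted for the complex can be cross-checked from the stated 185 kDa mass. The sketch below is illustrative; the helper name `moles_from_mass` is an assumption, and 1 kDa is taken as 1000 g/mol.

```python
def moles_from_mass(mass_grams: float, mass_kda: float) -> float:
    """Convert a protein mass to moles, taking 1 kDa as 1000 g/mol."""
    return mass_grams / (mass_kda * 1000.0)


COMPLEX_KDA = 185.0  # mass assumed in the text for the complex

# 37 ng of complex in the preincubation
preincubation_pmol = moles_from_mass(37e-9, COMPLEX_KDA) * 1e12

# 0.14 ng of complex added to the assay
assay_fmol = moles_from_mass(0.14e-9, COMPLEX_KDA) * 1e15

print(f"37 ng   -> {preincubation_pmol:.2f} pmol")
print(f"0.14 ng -> {assay_fmol:.2f} fmol")
```

Both results agree with the amounts given in the text (0.2 pmol and 0.76 fmol) to the quoted precision.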

Gel filtration analysis was performed using the Pharmacia HR
10/30 fast protein liquid chromatography columns, Superdex 75 and
Superose 12. Proteins were incubated together for 1 hour at 15°C in a
final volume of 200 μl buffer B (25 mM Tris-HCl (pH 7.5), 1 mM EDTA,
10% glycerol, and 100 mM NaCl). The ψ protein was first brought to 4 M
in urea to disaggregate it, and when present with other proteins the
final concentration of urea in buffer B was 0.5 M. The entire sample
was injected, the column was developed with buffer B, and after
collecting the first 6 ml, fractions of 170 μl were collected. The χ
protein was located in column fractions by analysis in 15%
SDS-polyacrylamide gels. Densitometry of Coomassie Blue-stained gels was
performed using a laser densitometer (Ultrascan XL).

Individual samples of χ (46 μg, 2.8 nmol as monomer) and of ψ (45
μg, 3 nmol as monomer), and a mixture of χ (218 μg, 13 nmol as
monomer) and ψ (45 μg, 3 nmol as monomer) were incubated 30 minutes
at 4°C in 200 μl buffer B with 5% glycerol (samples containing ψ
contained a final concentration of 0.5 M urea in the 200 μl as explained
above). Samples were layered onto 12.3 ml gradients of 10%-30%
glycerol in 25 mM Tris-HCl (pH 7.5), 0.1 M NaCl and 1 mM EDTA. Protein
standards in 200 μl of buffer B with 5% glycerol were layered onto
another gradient and the gradients were centrifuged at 270,000 x g for
44 hours at 4°C. Fractions were collected and analyzed.

The polymerase chain reaction was used to precisely clone the
holC gene into the T7 based pET expression system [see Methods in
Enzymology 185:60 (1990)]. Primers upstream and downstream of holC
were synthesized to amplify a 604 bp fragment containing the holC gene
from E. coli genomic DNA. The upstream primer hybridized over the
start codon of holC and included two mismatched nucleotides in order
to create an NdeI restriction site at the initiating ATG. The primer
downstream of holC included three mismatched nucleotides to create a
BamHI restriction site. The amplified 604 bp fragment was digested
with NdeI and BamHI and cloned into the NdeI-BamHI site of the T7
based expression vector pET3c to yield pET-χ. In the pET-χ plasmid, the
holC gene is under the control of a strong T7 RNA polymerase promoter
and an efficient Shine-Dalgarno sequence in favorable context for
translation initiation. DNA sequencing of the pET-χ plasmid showed its
sequence was identical to that of pUC-χ, and therefore no errors were
incurred during amplification.

The pET-χ expression plasmid was introduced into strain
BL21(DE3)pLysS, which is a lysogen carrying the T7 RNA polymerase gene
under the control of the IPTG-inducible lac UV5 promoter. Upon
induction with IPTG and continued growth for 3 hours, the χ protein was
expressed to a level of 27% total cell protein. Upon cell lysis, only
about 30% of the χ protein was in the soluble fraction, the rest being
found in the cell debris. Induction at lower temperature (20°C) or for
shorter times did not appear to increase the proportion of χ in the
soluble fraction.

Four liters of induced cells were lysed and 38 mg of pure χ was
obtained in 38% overall yield upon fractionation with ammonium sulfate
precipitation, followed by column chromatography using Q sepharose
and heparin agarose. The χ protein was well behaved throughout the
purification and showed no tendency to aggregate. The N-terminal
sequence analysis of the pure cloned χ matched that of the holC gene,
indicating that the protein had been successfully cloned and purified.
The expressed χ protein also comigrated with the authentic χ subunit
contained within pol III*.

In summary, as a result of the present invention, the location and
sequence of χ was determined. The χ subunit (400 pmol) was separated
from γ and ψ subunits of the γχψ complex by SDS denaturation and
resolution on a 15% polyacrylamide gel, and 100 pmol transferred to a
PVDF Immobilon membrane for amino terminal sequence analysis; the
remainder was transferred to nitrocellulose for sequence analysis of
internal peptides following trypsin digestion. After transfer, the
protein was visualized by Ponceau S stain and excised from the gel. The
sequences of the N-terminal amino acids and four internal peptides were
determined as described above, and these sequences were used to
search the GenBank database. One single exact match was found at
about 96.5 minutes on the E. coli chromosome between the xerB and
valS genes.

The recombinant Kohara λ clone 5C4 contains the DNA fragment
encompassing the xerB and partial valS genes, and the χ gene was
subcloned by ligation of the BamHI fragment of λ 5C4 into pUC18.
Sequence analysis was performed directly on the plasmid. As shown
above, the open reading frame of the χ gene was 441 nucleotides long.
Its initiation codon is 160 nucleotides downstream of the stop codon of
the xerB gene, while its termination codon, TAA, has one base
overlapping with the start codon of the valS gene. The fact that the
xerB and χ genes are transcribed in the same direction, and that no
promoter consensus sequences were found for the χ gene alone, suggests
that these two genes are in the same operon.

When PCR was applied to clone the χ gene into the T7 based
expression system, PCR primers based upon the known sequences of the
xerB and valS genes were made to amplify the fragment between the
two genes. As described, E. coli genomic DNA was used as the PCR
template, and a fragment of approximately 600 base pairs was
amplified. The PCR fragment, after being digested with NdeI and BamHI,
was cloned into the NdeI-BamHI site of the expression vector pET3c in
similar manner to what was done with the preceding gene sequences.
Thus, the putative χ gene was put under the control of a strong T7 RNA
polymerase promoter as well as the efficient translation
initiation signal, and transcription termination sequence downstream
of the BamHI site. Direct DNA sequencing of the plasmids formed
showed that they were all identical to the sequence of the λ clone.

The resulting plasmids were transformed into E. coli
BL21(DE3)pLysS that contained a lysogen carrying the T7 RNA polymerase
gene under the control of the IPTG-inducible lac UV5 promoter [see
Methods in Enzymology 185:60 (1990)]. Transformants were selected by
ampicillin and chloramphenicol resistance, and subsequently subjected
to IPTG induction as described above. A protein of about 17 kDa was
overproduced in all three PCR clones. The γ complex was run in parallel
with the three clones on an SDS-PAGE gel, and when the overproduced
protein and the χ subunit were at similar amounts, they showed the same
gel mobility. This observation supported the identity of the
overproduced protein as the χ subunit.
In addition to the specific sequences provided above for the
individual genes according to the present invention, the present
invention also extends to mutations, deletions and additions to these
sequences provided such changes do not substantially affect the present
properties of the listed sequences.

As described, the naturally occurring holoenzyme consists of 10
protein subunits and is capable of extending DNA faster than
polymerase I, and producing a product many times larger than the
polymerase I enzyme. Thus, these unique properties of the 5, preferably
6, active subunits of the present invention are likely to find wide
application in, for example, long chain PCR - using the active sequence
according to the present invention PCR can be performed over several
tens of kb; PCR performed at room temperature - the active sequence
according to the present invention is uniquely adapted to be a
polymerase of choice for PCR at room temperature due to its high
fidelity; extension of site mutated primers without catalyzing strand
displacement; and for sequencing operations wherein other polymerases
find difficulty. Other uses will become more apparent to those skilled
in the art as the science of molecular genetics continues to progress.

The sequence listing for the nucleic acid sequences and peptide
sequences which are contained in the present description is as follows:

SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: Michael E. O'Donnell 
(ii) TITLE OF INVENTION: DNA POLYMERASE III HOLOENZYME

(iii) NUMBER OF SEQUENCES: 60

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

Met Leu Arg Leu Tyr Pro Glu Gln Leu Arg Ala Gln Leu Asn Glu
                5                   10                  15

Gly Leu Arg Ala Ala Tyr Leu Leu Leu Gly Asn Asp Pro
                20                  25          28

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Ala Ala Tyr Leu Leu Leu Gly Asn Asp Pro Leu Leu Leu Gln
                5                   10

Glu Ser Gln Asp Ala Val Arg
15                  20

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Ala Gln Glu Asn Ala Ala Trp Phe Thr Ala Leu Ala Asn Arg

5 10 14 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Val Glu Gln Ala Val Asn Asp Ala Ala His Phe Thr Pro Phe His
5 10 15 

Trp Val Asp Ala Leu Leu Met Gly Lys 
20 24 
(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

GTACAACCGA ATCATATGTT ACCCAGCGAG CTC 33 

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1032 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

ATG ATT CGG TTG TAC CCG GAA CAA CTC CGC GCG CAG CTC 39
AAT GAA GGG CTG CGC GCG GCG TAT CTT TTA CTT GGT AAC 78
GAT CCT CTG TTA TTG CAG GAA AGC CAG GAC GCT GTT CGT 117
CAG GTA GCT GCG GCA CAA GGA TTC GAA GAA CAC CAC ACT 156
TTT TCC ATT GAT CCC AAC ACT GAC TGG AAT GCG ATC TTT 195
TCG TTA TGC CAG GCT ATG AGT CTG TTT GCC AGT CGA CAA 234
ACQ CTA TTG CTG TTG TTA CCA GAA AAC GGA CCG AAT GCG 273
GCG ATC AAT GAG CAA CTT CTC ACA CTC ACC GGA CTT CTG 312
CAT GAC GAC CTG CTG TTG ATC GTC CGC GGT AAT AAA TTA 351
AGC AAA GCG CAA GAA AAT GCC GCC TGG TTT ACT GCG CTT 390
GCG AAT CGC AGC GTG CAG GTG ACC TGT CAG ACA CCG GAG 429
CAG GCT CAG CTT CCC CGC TGG GTT GCT GCG CGC GCA AAA 468
CAG CTC AAC TTA GAA CTG GAT GAC GCG GCA AAT CAG GTG 507
CTC TGC TAC TGT TAT GAA GGT AAC CTG CTG GCG CTG GCT 546
CAG GCA CTG GAG CGT TTA TCG CTG CTC TGG CCA GAC GGC 585
AAA TTG ACA TTA CCG CGC GTT GAA CAG GCG GTG AAT GAT 624
GCC GCG CAT TTC ACC CCT TTT CAT TGG GTT GAT GCT TTG 663
TTG ATG GGA AAA AGT AAG CGC GCA TTG CAT ATT CTT CAG 702
CAA CTG CGT CTG GAA GGC AGC GAA CCG GTT ATT TTG TTG 741
CGC ACA TTA CAA CGT GAA CTG TTG TTA CTG GTT AAC CTG 780
AAA CGC CAG TCT GCC CAT ACG CCA CTG CGT GCG TTG TTT 819
GAT AAG CAT CGG GTA TGG CAG AAC CGC CGG GGC ATG ATG 858
GGC GAG GCG TTA AAT CGC TTA AGT CAG ACG CAG TTA CGT 897
CAG GCC GTG CAA CTC CTG ACA CGA ACG GAA CTC ACC CTC 936
AAA CAA GAT TAC GGT CAG TCA GTG TGG GCA GAG CTG GAA 975
GGG TTA TCT CTT CTQ TTG TGC CAT AAA CCC CTG GCG GAC 1014
GTA TTT ATC GAC GGT TGA 1032

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 127 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

CCGAACAGCT GATTCGTAAG CTGCCAAGCA TCCGTGCTQC GGATATTCGT 50 
TCCGACGAAG AACAGACGTC GACCACAACG GATACTCCGG CAACGCCTGC 100 
ACGCGTCTCC ACCACGCTGG GTAACTG 127 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 105 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

TGA ATGAAATCT TTACAGGCTC TGTTTGGCGG CACCTTTGAT 43

CCGGTGCACT ATGGTCATCT AAAACCCGTT GGAAGCGTGG CCGAAGTTTT 93 
GATTGGTCTG AC 105 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 343 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Met Ile Arg Leu Tyr Pro Glu Gln Leu Arg Ala Gln Leu Asn Glu
                5                   10                  15

Gly Leu Arg Ala Ala Tyr Leu Leu Leu Gly Asn Asp Pro Leu Leu
                20                  25                  30

Leu Gln Glu Ser Gln Asp Ala Val Arg Gln Val Ala Ala Ala Gln
                35                  40                  45

Gly Phe Glu Glu His His Thr Phe Ser Ile Asp Pro Asn Thr Asp
                50                  55                  60

Trp Asn Ala Ile Phe Ser Leu Cys Gln Ala Met Ser Leu Phe Ala
                65                  70                  75

Ser Arg Gln Thr Leu Leu Leu Leu Leu Pro Glu Asn Gly Pro Asn
                80                  85                  90

Ala Ala Ile Asn Glu Gln Leu Leu Thr Leu Thr Gly Leu Leu His
                95                  100                 105

Asp Asp Leu Leu Leu Ile Val Arg Gly Asn Lys Leu Ser Lys Ala
                110                 115                 120

Gln Glu Asn Ala Ala Trp Phe Thr Ala Leu Ala Asn Arg Ser Val
                125                 130                 135

Gln Val Thr Cys Gln Thr Pro Glu Gln Ala Gln Leu Pro Arg Trp
                140                 145                 150

Val Ala Ala Arg Ala Lys Gln Leu Asn Leu Glu Leu Asp Asp Ala
                155                 160                 165

Ala Asn Gln Val Leu Cys Tyr Cys Tyr Glu Gly Asn Leu Leu Asn
                170                 175                 180

Leu Ala Gln Ala Leu Glu Arg Leu Ser Leu Leu Trp Pro Asp Gly
                185                 190                 195

Lys Leu Thr Leu Pro Arg Val Glu Gln Ala Val Asn Asp Ala Ala
                200                 205                 210

His Phe Thr Pro Phe His Trp Val Asp Ala Leu Leu Met Gly Lys
                215                 220                 225

Ser Lys Arg Ala Leu His Ile Leu Gln Gln Leu Arg Leu Gly Gly
                230                 235                 240

Ser Glu Pro Val Ile Leu Leu Arg Thr Leu Gln Arg Glu Leu Leu
                245                 250                 255

Leu Leu Val Asn Leu Lys Arg Gln Ser Ala His Thr Pro Leu Arg
                260                 265                 270

Ala Leu Phe Asp Lys His Arg Val Trp Gln Asn Arg Arg Gly Met
                275                 280                 285

Met Gly Glu Ala Leu Asn Arg Leu Ser Gln Thr Gln Leu Arg Gln
                290                 295                 300

Ala Val Gln Leu Leu Thr Arg Thr Glu Leu Thr Leu Lys Gln Asp
                305                 310                 315

Tyr Gly Gln Ser Val Trp Ala Glu Leu Glu Gly Leu Ser Leu Leu
                320                 325                 330

Leu Cys His Lys Pro Leu Ala Asp Val Phe Ile Asp Gly
                335                 340         343

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 334 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Arg Trp Tyr Pro Trp Leu Arg Pro Asp Phe Glu Lys Leu Val
                5                   10                  15

Ala Ser Tyr Gln Ala Gly Arg Gly His His Ala Leu Leu Ile Gln
                20                  25                  30

Ala Leu Pro Gly Met Gly Asp Asp Ala Leu Ile Tyr Ala Leu Ser
                35                  40                  45

Arg Tyr Leu Leu Cys Gln Gln Pro Gln Gly His Lys Ser Cys Gly
                50                  55                  60

His Cys Arg Gly Cys Gln Leu Met Gln Ala Gly Thr His Pro Asp
                65                  70                  75

Tyr Tyr Thr Leu Ala Pro Glu Lys Gly Lys Asn Thr Leu Gly Val
                80                  85                  90

Asp Ala Val Arg Glu Val Thr Glu Lys Leu Asn Glu His Ala Arg
                95                  100                 105

Leu Gly Gly Ala Lys Val Val Trp Val Thr Asp Ala Ala Leu Leu
                110                 115                 120

Thr Asp Ala Ala Ala Asn Ala Leu Leu Lys Thr Leu Glu Glu Pro
                125                 130                 135

Pro Ala Glu Thr Trp Phe Phe Leu Ala Thr Arg Glu Pro Glu Arg
                140                 145                 150

Leu Leu Ala Thr Leu Arg Ser Arg Cys Arg Leu His Tyr Leu Ala
                155                 160                 165

Pro Pro Pro Glu Gln Tyr Ala Val Thr Trp Leu Ser Arg Glu Val
                170                 175                 180

Thr Met Ser Gln Asp Ala Leu Leu Ala Ala Leu Arg Leu Ser Ala
                185                 190                 195

Gly Ser Pro Gly Ala Ala Leu Ala Leu Phe Gln Gly Asp Asn Trp
                200                 205                 210

Gln Ala Arg Glu Thr Leu Cys Gln Ala Leu Ala Tyr Ser Val Pro
                215                 220                 225

Ser Gly Asp Trp Tyr Ser Leu Leu Ala Ala Leu Asn His Glu Gln
                230                 235                 240

Ala Pro Ala Arg Leu His Trp Leu Ala Thr Leu Leu Met Asp Ala
                245                 250                 255

Leu Lys Arg His His Gly Ala Ala Gln Val Thr Asn Val Asp Val
                260                 265                 270

Pro Gly Leu Val Ala Glu Leu Ala Asn His Leu Ser Pro Ser Arg
                275                 280                 285

Leu Gln Ala Ile Leu Gly Asp Val Cys His Ile Arg Glu Gln Leu
                290                 295                 300

Met Ser Val Thr Gly Ile Asn Arg Glu Leu Leu Ile Thr Asp Leu
                305                 310                 315

Leu Leu Arg Ile Glu His Tyr Leu Gln Pro Gly Val Val Leu Pro
                320                 325                 330

Val Pro His Leu
            334

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 57 nucleic acids

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

ACTCTGGAAG AACCGCCGGC TGAAACTTGG TTTTTTCTGG CTACTCGTGA 50 
ACCGGAA 57 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

GCTGGTTCTC CGGGTGCTGC TCTCGCTCTG TTTCAGGGTG ATGACTGGCA 50 
GGCT 54 

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1002 bp

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

ATG AGA TGG TAT CCA TGG TTA CGA CCT GAT TTC GAA AAA 39 

CTG GTA GCC AGC TAT CAG GCC GGA AGA GGT CAC CAT GCG 78 

CTA CTC ATT CAG GCG TTA CCG GGC ATG GGC GAT GAT GCT 117 

TTA ATC TAC GCC CTG AGC CGT TAT TTA CTC TGC CAA CAA 156 

CCG CAG GGC CAC AAA AGT TGC GGT CAC TGT CGT GGA TGT 195 

CAG TTG ATG CAG GCT GGC ACG CAT GCC GAT TAC TAC ACC 234 

CTG GCT CCC GAA AAA GGA AAA AAT ACG CTG GGC GTT GAT 273 
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GCG GTA CGT GAG GTC ACC GAA AAG CTG AAT GAG CAC GCA 312 

CGC TTA GGT GGT GCG AAA GTC GTT TGG GTA ACC GAT GCT 351 

GCG TTA CTA ACC GAC GCC GCG GCT AAC GCA TTG CTG AAA 390 

ACG CTT GAA GAG CCA CCA GCA GAA ACT TGG TTT TTC CTG 429 

GCT ACC CGC GAG CCT GAA CGT TTA CTG GCA ACA TTA CGT 468 

AGT CGT TGT CGG TTA CAT TAC CTT GCG CCG CCG CCG GAA 507 

CAG TAC GCC GTG ACC TGG CTT TCA CGC GAA GTG ACA ATG 546 

TCA CAG GAT GCA TTA CTT GCC GCA TTG CGC TTA AGC GCC 585 

GGT TCG CCT GGC GCG GCA CTG GCG TTG TTT CAG GGA GAT 624 

AAC TGG CAG GCT CGT GAA ACA TTG TGT CAG GCG TTG GCA 663 

TAT AGC GTG CCA TCG GGC GAT TGG TAT TCG CTG CTA GCG 702 

GCC CTT AAT CAT GAA CAA GTC CCG GCG CGT TTA CAC TGG 741 

CTG GCA ACG TTG CTG ATG GAT GCG CTA AAA CGC CAT CAT 780 

GGT GCT GCG CAG GTG ACC AAT GTT GAT GTG CCG GGC CTG 819 

GTC GCC GAA CTG GCA AAC CAT CTT TCT CCC TCG CGC CTG 858 

CAG GCT ATA CTG GGG GAT GTT TGC CAC ATT CGT GAA CAG 897 

TTA ATG TCT GTT ACA GGC ATC AAC CGC GAG CTT CTC ATC 936 

ACC GAT CTT TTA CTG CGT ATT GAG CAT TAC CTG CAA CCG 975 
GGC GTT GTG CTA CCG GTT CCT CAT CTT 1002 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 157 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

AAGAATCTTT CGATTTCTTT AATOGCACCC GCGCCCGCTA TCTGGAACTG 50 

GCAGCACAAG ATAAAAGCAT TCATACCATT GATGCCACCC AGCCGCTGGA 100 

GGOCGTGATG GATGCAATCC GCACTACCGT GACCCACTGG GTGAAGGAGT 150 

TGGACGC 157 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 143 bp 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
TTA GAGAGACATC ATGTTTTTAG TCGACTCACA CTGCCATCTC 43 
GATGGTCTGG ATTATGAATC TTTGCATAAG GACGTGGATG ACGTTCTGGC 93 
GAAAGCCGCC GCACGCGATG TGAAATTTTG TCTGGCAGTC GCCACAACAT 143 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
Met Arg Trp Tyr Pro Pro Leu Arg Pro Asp Phe Glu Lys Leu Val Ala 

5 10 15 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 
(B) TYPE: amino acid 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
Glu Val Thr Glu Lys Leu Asn Glu His Ala Arg 
5 10 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Val Val Trp Val Thr Asp Ala Ala Leu Leu Thr Asp Ala Ala Ala 
5 10 15 

Asn Ala Leu Leu Lys 

20 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 amino acids 
(B) TYPE: amino acid 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Thr Leu Glu Glu Pro Pro Ala Glu Thr Trp Phe Phe Leu Ala Thr 
5 10 15 

Arg Glu Pro Glu Arg Leu Leu Ala Thr Leu 
20 25 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Leu His Tyr Leu Ala Pro Pro Pro Glu Gln Tyr Ala Val Thr Trp 

5 10 15 

Leu Ser Arg 
18 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 
(B) TYPE: amino acid 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
Leu Ser Ala Gly Ser Pro Gly Ala Ala Leu Ala Leu Phe Gln Gly 
5 10 15 

Asp Asn Trp Gln Ala Arg 

20 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Leu Gly Gly Ala Lys 

5 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 58 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

Ala Cys Thr Cys Thr Gly Gly Ala Ala Gly Ala Ala Cys Cys Gly 
5 10 15 

Cys Cys Gly Gly Cys Thr Thr Gly Ala Ala Ala Cys Thr Thr Gly 
20 25 30 

Gly Thr Thr Thr Thr Thr Thr Cys Thr Gly Gly Cys Thr Ala Cys 
35 40 45 

Thr Cys Gly Thr Gly Ala Ala Cys Cys Gly Gly Ala Ala 
50 55 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

Gly Cys Thr Gly Gly Thr Thr Cys Thr Cys Cys Gly Gly Gly Thr 

5 10 15 

Gly Cys Thr Gly Cys Thr Cys Thr Gly Gly Cys Thr Cys Thr Gly 
20 25 30 

Thr Thr Thr Cys Ala Gly Gly Gly Thr Gly Ala Thr Ala Ala Cys 
35 40 45 

Thr Gly Gly Cys Ala Gly Gly Cys Thr 
50 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
Gly Gly Thr Gly Ala Ala Gly Gly Ala Gly Thr Thr Gly Gly Ala 

5 10 15 



Cys Ala Thr Ala Thr Gly Ala Gly Ala Thr Gly Gly Thr Ala Thr 
20 25 30 

Cys Cys Ala 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

Met Leu Lys Asn Leu Ala Lys Leu Asp Gln Thr Glu Met Asp Lys 

5 10 15 

Val Asn Val Asp Leu Ala Ala Ala Gly Val Ala Phe Lys Glu Arg 
20 25 30 

Tyr Asn Met Pro Val Ile Ala Glu Ala Val 

35 40 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 57 bp 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

ATGCTGAAAA ACCTGGCTAA ACTGGATCAG ACTGAAATGG ATAAAGTTAA 50 

CGTTGAT 57 

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 57 bp 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

CTGGCTGCTG CTGGTGTTGC TTTTAAGGAA CGTTATAACA TGCCGGTTAT 50 

TGCTGAA 57 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 228 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

ATG CTG AAG AAT CTG GCT AAA CTG GAT CAA ACA GAA ATG 39 
GAT AAA GTG AAT GTC GAT TTG GCG GCG GCC GGG GTG GCA 78 
TTT AAA GAA CGC TAC AAT ATG CCG GTG ATC GCT GAA GCG 117 
GTT GAA CGT GAA CAG CCT GAA CAT TTG CGC AGC TGG TTT 156 
CGC GAG CGG CTT ATT GCC CAC CGT TTG GCT TCG GTC AAT 195 
CTG TCA CGT TTA CCT TAC GAG CCC AAA CTT AAA 228 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 172 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

AG GCGTAGCGAA GGGAGCGTGC AGTTGAAGCC ATATTATCTA 42 
TTCCTTTTTG TAATAACTTT TTTACAGACG ATAACCTTGT CTAATGTCTG 92 
AGTCGAGGAT CATCAATTCC GGCTTGCCAT CCTGGCTCAC TCTTAGTAAC 142 
TTTTGCCCGC GAATGATGAG GAGATTAAGA 172 

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

TAA AACTTATAC AGAGTTACAC TTTCTTACAT AACGCCTGCT 42 
AAATTATGAG TATTTTCTAA ACO^CACTCA TAATTTGCAG TCATTTTGAA 92 
AAGGAAGTCA TTATG 107 

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 76 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: peptide 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 



Met Leu Lys Asn Leu Ala Lys Leu Asp Gln Thr Glu Met Asp Lys 
5 10 15 

Val Asn Val Asp Leu Ala Ala Ala Gly Val Ala Phe Lys Glu Arg 
20 25 30 

Tyr Asn Met Pro Val Ile Ala Glu Ala Val Glu Arg Glu Gln Pro 
35 40 45 

Glu His Leu Arg Ser Trp Phe Arg Glu Arg Leu Ile Ala His Arg 
50 55 60 

Leu Ala Ser Val Asn Leu Ser Arg Leu Pro Tyr Glu Pro Lys Leu 
65 70 75 

Lys 
76 

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

Met Leu Lys Asn Leu Ala Lys Leu Asp Gln Thr Glu Met Asp Lys 
5 10 15 

Val Asn Val Asp Leu Ala Ala Ala Gly Val Ala Phe Lys Glu Ala 

20 25 30 

Tyr Asn Met Pro Val Ile Ala Glu Ala Val 
35 40 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 57 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

ATG CTG AAA AAC CTG GCT AAA CTG GAT CAG ACT GAA ATG GAT 42 
AAA GTT AAC GTT GAT 57 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 57 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

CTG GCT GCT GCT GGT GTT GCT TTT AAA GAA CGT TAT AAC ATG 42 
CCG GTT ATT GCT GAA 57 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

ATGATGAGGA GATTACATAT GCTGAAGAAT CTG 33 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 56 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: hook 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

TTI03GCTTAAGGAG 

TTTGCCGAATTCCTCX^ 56 

(2) INFORMATION FOR SEQ ID NO: 38: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 137 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

Met Thr Ser Arg Arg Asp Trp Gln Leu Gln Gln Leu Gly Ile Thr 

5 10 15 

Gln Trp Ser Leu Arg Arg Pro Gly Ala Leu Gln Gly Glu Ile Ala 
20 25 30 

Ile Ala Ile Pro Ala His Val Arg Leu Val Met Val Ala Asn Asp 
35 40 45 

Leu Pro Ala Leu Thr Asp Pro Leu Val Ser Asp Val Leu Arg Ala 

50 55 60 

Leu Thr Val Ser Pro Asp Gln Val Leu Gln Leu Thr Pro Glu Lys 
65 70 75 

Ile Ala Met Leu Pro Gln Gly Ser His Cys Asn Ser Trp Arg Leu 
80 85 90 

Gly Thr Asp Glu Pro Leu Ser Leu Glu Gly Ala Gln Val Ala Ser 
95 100 105 

Pro Ala Leu Thr Asp Leu Arg Ala Asn Pro Thr Ala Arg Ala Ala 
110 115 120 

Leu Trp Gln Gln Ile Cys Thr Tyr Glu His Asp Phe Phe Pro Gly 

125 130 135 

Asn Asp 
137 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 411 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

ATG ACA TCC CGA CGA GAC TGG CAG TTA CAG CAA CTG GGC 39 

ATT ACC CAG TGG TCG CTG CGT CGC CCT GGC GCG TTG CAG 78 

GGC GAG ATT GCC ATT GCG ATC CCG GCA CAC GTC CGT CTG 117 

GTG ATG GTG GCA AAC GAT CTT CCC GCC CTG ACT GAT CCT 156 

TTA GTG AGC GAT GTT CTG CGC GCA TTA ACC GTC AGC CCC 195 

GAG CAG GTG CTG CAA CTG ACG CCA GAA AAA ATC GCG ATG 234 

CTG CCG CAA GGC AGT CAC TGC AAC AGT TGG CGG TTG GGT 273 

ACT GAC GAA CCG CTA TCA CTG GAA GGC GCT CAG GTG GCA 312 

TCA CCG GCG CTC ACC GAT TTA CGG GCA AAC CCA ACG GCA 351 

CGC GCC GCG TTA TGG CAA CAA ATT TGC ACA TAT GAA CAC 390 
GAT TTC TTC CCT GGA AAC GAC 411 

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 77 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

GGCGATTATA GCCATATGTT GGCGCGGTA CGACGAATTT GCTATATTTG 50 
CGCCCCTGAC AACAGGAGCG ATTCGCT 77 

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 103 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

TGA TTTACCGGCA GCTTACCACA TTGAACAACG CGCCCACGCC 43 

TTTCCGTGGA GTGAAAAAAC GTTTGCCAGC AACCAGGGCG AGCGTTATCT 93 
CAAGTTTCAG 103 

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 27 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

GATTCCATAT GACATCCCGA CGAGACT 27 

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 
GACTGGATCC CTGCAGGCCG GTGAATGAGT 30 

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

Leu Gly Thr Asp Glu Pro Leu Ser Leu Glu Glu Ala Gln Val Ala 
5 10 15 

Ser Pro 
17 

(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

Ala Ala Leu Trp Gln Gln Ile Cys Thr Tyr Glu His Asp Phe Phe 

5 10 15 

Pro Ala 

17 

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 bp 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 
CAACAGGAGC GATTCCATAT GACATCCCGA CG 32 

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 
GATTCGGATC CCTGCAGGCC GGTGAATGAG T 31 

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 bp 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 
CCCCACATAT GAAAAACGCG ACGTTCTACC 30 

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

ACCCGGATCC AAACTGCCGG TGACATTC 28 

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 441 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

ATG AAA AAC GCG ACG TTC TAC CTT CTG GAC AAT GAC ACC 39 

ACC GTC GAT GGC TTA AGC GCC GTT GAG CAC CTG GTG TGT 78 

GAA ATT GCC GCA GAA CGT TGG CGC AGC GGT AAG CGC GTG 117 

CTC ATG GCC TGT GAA GAT GAA AAG CAG GCT TAC GCC CTG 156 

GAT GAA GCC CTG TGG GCG CGT CCG GCA GAA AGG TTT GTT 195 

CCG CAT AAT TTA GCG GGA GAA GGA CCG CGC GGC GGT GTA 234 

CCG GTG GAG ATC GCC TGG CCG CAA AAG CGT AGC AGC AGC 273 

CGG CGC GAT ATA TTG ATT AGT CTG CGA ACA AGC TTT GCA 312 

GAT TTT GCC ACC GCT TTT ACA GAA GTG GTA GAC TTC GTT 351 

CCT CAT GAA GAT TCT CTG AAA CAA CTG GCG CGC GAA CGC 390 

TAT AAA GCC TAC CGC GTG GCT GGT TTC AAC CTG AAT ACG 429 
GCA ACT TGG AAA 441 

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 175 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

TAACGGCGAA GAGTAATTGC GTCAGGCAAG GCTGTTATTG CCGGATGCGG 50 
CGTGAACGCC TTATCCGACC TACACAGCAC TGAACTCGTA GGCCTGATAA 100 
GACACAACAG CGTCGCATCA GGCGCTGCGG TGTATACCTG ATGCGTATTT 150 
AAATCCACCA CAAGAAGCCC CATTT 175 

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

TAA TGGAAAA GACATATAAC CCACAAGATA TCGAACAGCC GCTTTACGAG 50 

CACTGGGAAA AAAGCCAGGA AAGTTTCTGC ATCATGATCC CGCCGCCGAA 100 

(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 147 amino acids 

(B) TYPE: amino acid 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

Met Lys Asp Ala Thr Phe Tyr Leu Leu Asp Asn Asp Thr Thr Val 
5 10 15 

Asp Gly Leu Ser Ala Val Glu Gln Leu Val Cys Glu Ile Ala Ala 
20 25 30 

Glu Arg Trp Arg Ser Gly Lys Arg Val Leu Ile Ala Cys Glu Asp 
35 40 45 

Glu Lys Gln Ala Tyr Arg Leu Asp Glu Ala Leu Trp Ala Arg Pro 

50 55 60 

Ala Glu Ser Phe Val Pro His Asn Leu Ala Gly Glu Gly Pro Arg 

65 70 75 

Gly Gly Ala Pro Val Glu Ile Ala Trp Pro Gln Lys Arg Ser Ser 

80 85 90 

Ser Arg Arg Asp Ile Leu Ile Ser Leu Arg Thr Ser Phe Ala Asp 
95 100 105 

Phe Ala Thr Ala Phe Thr Glu Val Val Asp Phe Val Pro Tyr Glu 
110 115 120 

Asp Ser Leu Lys Gln Leu Ala Arg Glu Arg Tyr Lys Ala Tyr Arg 

125 130 135 

Val Ala Gly Phe Asn Leu Asn Thr Ala Thr Trp Lys 
140 145 147 

(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

Met Lys Asn Ala Thr Phe Tyr Leu Leu Asp Asn Asp Thr Thr Val 
5 10 15 

Asp Gly Leu Ser Ala Val Glu Gln Leu Val Xxx Glu Ile Ala 

20 25 

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 

Val Leu Ile Ala Xxx Glu Asp Glu Lys 
5 

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

Leu Asp Glu Ala Leu Trp Ala Ala Pro Ala Glu Ser Phe Val Pro 

(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

Gly Gly Ala Pro Val Glu Ile Ala Trp Pro 

5 10 

(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

Gly Phe Asn Leu Asn Thr Ala Thr 
5 

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

CCCCACATAT GAAAAACGCG ACGTTCTACC 30 

(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 bp 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

Thus, while I have illustrated and described the preferred 
embodiment of my invention, it is to be understood that this invention 
is capable of variation and modification, and I therefore do not wish to 
be limited to the precise terms set forth, but desire to avail myself of 
such changes and alterations which may be made for adapting the 
invention to various usages and conditions. Accordingly, such changes 
and alterations are properly intended to be within the full range of 
equivalents, and therefore within the purview of the following claims. 

Having thus described my invention and the manner and a process 
of making and using it in such full, clear, concise and exact terms so as 
to enable any person skilled in the art to which it pertains, or with 
which it is most nearly connected, to make and use the same; 
I claim: 

1. A genetic sequence for an isolated subunit for a DNA 
polymerase III gene selected from the subunit group consisting of δ, δ', 
χ, θ, and ψ genes. 

2. An isolated and purified enzyme showing the activity of the 
subunit DNA polymerase III holoenzyme which contains only the 
polymerase peptide subunits for α, ε, β, δ, and γ subunits of the naturally 
occurring holoenzyme. 

3. The enzyme according to Claim 2 which further contains the 
polymerase peptide subunit δ' of the naturally occurring holoenzyme. 

4. An isolated and purified peptide selected from the group of 
the δ, δ', χ, θ, and ψ DNA polymerase III holoenzyme peptides. 

COMBINED DECLARATION FOR PATENT APPLICATION AND POWER OF ATTORNEY 



Docket No.: CRFD-1156A 

As a below named inventor, I hereby declare that: 

My residence, post office address and citizenship are as stated below next to my name; 

I believe I am the original, first and sole inventor (if only one name is listed below) or 
an original, first and joint inventor (if plural names are listed below) of the subject 
matter which is claimed and for which a patent is sought on the invention entitled: 

DNA POLYMERASE III HOLOENZYME 

the specification of which 

[ ] is attached hereto 

[X ] was filed on July 22nd 1994 as United States Patent Application 

08/279,058 as a Continuation in Part of earlier filed United States Patent 
Application 07/826,926 filed on January 24th 1992. 

I hereby state that I have reviewed and understand the contents of the above identified 
specification, including the claims, as amended by any amendment referred to above. 

I acknowledge the duty to disclose information which is material to the examination of 
this application in accordance with Title 37, Code of Federal Regulations, 1.56(a). 

I hereby claim foreign priority benefits under Title 35, United States Code, 119 of any 
foreign applications for patent or inventor's certificate listed below and have also 
identified below any foreign application for patent or inventor's certificate having a 
filing date before that of the application on which priority is claimed: 

Prior Foreign Applications: 

No.: Country: Filing Date: Priority Claim: 

PCT US93/00627 January 22nd 1993 yes 

I hereby claim the benefit under Title 35, United States Code, 120 of any United States 
application(s) listed below and, insofar as the subject matter of each of the claims of 
this application is not disclosed in the prior United States application in the manner 
provided by the first paragraph of Title 35 United States Code 112, I acknowledge the 
duty to disclose material information as defined in Title 37 Code of Federal Regulations, 
1.56(a) which occurred between the filing date of the prior application and the national 
or PCT international filing date of this application: 



Application Serial No.: Filing Date: Status: 
07/826,926 January 24th 1992 abandoned 






I hereby appoint the following attorney(s) to prosecute this application and to transact 
all business in the Patent and Trademark Office in connection therewith: 



Address all telephone calls to: George M. Yahwak at telephone No.: (203)268-1951. 



I hereby declare that all statements made herein of my own knowledge are true and that 
all statements made on information and belief are believed to be true; and further that 
these statements were made with the knowledge that willful false statements and the like 
so made are punishable by fine or imprisonment, or both, under section 1001 of Title 
18 of the United States Code and that such willful false statements may jeopardize the 
validity of the application or any patent issued thereon. 

First inventor: Michael E. O'Donnell 

Residence: Hastings on Hudson, New York Citizenship: USA 
Post Office Address: 16 Maple Lane, Hastings on Hudson, New York 10706 



George M. Yahwak (Registration No. 26,824) 



Address all correspondence to: 



Yahwak & Associates 

25 Skytop Drive 

Trumbull, Connecticut 06611 




Date: 




